commit 3ff2f1f (1 parent: f69a87f)
README.md
@@ -13,7 +13,7 @@ Fundamental research to develop new architectures for foundation models and A(G)I
 - Capability - A [**Length-Extrapolatable**](https://arxiv.org/abs/2212.10554) Transformer
 - Efficiency - [**X-MoE**](https://arxiv.org/abs/2204.09179): scalable & finetunable sparse Mixture-of-Experts (MoE)
 
-### Revolutionizing Transformers for (M)LLMs and AI
+### The Revolution of Model Architecture
 
 - [**BitNet**](https://arxiv.org/abs/2310.11453): 1-bit Transformers for Large Language Models
 - [**RetNet**](https://arxiv.org/abs/2307.08621): Retentive Network: A Successor to Transformer for Large Language Models
 - [**LongNet**](https://arxiv.org/abs/2307.02486): Scaling Transformers to 1,000,000,000 Tokens