Tags
- Abstract algebra
- Adam Roberts
- Ambrosio Blanco
- Andy Zou
- Batch size
- Blanco Colin
- Bradley Kyle
- Charles Foster
- Chung Constant
- Chung Hou
- Chung Yi
- Colin Clement
- Collin Burns
- Daya Guo
- Denny Zhou
- Encoder-decoder
- Encoder-decoder architecture
- Eric Hallahan
- Evaluations
- Fact
- Field extension
- Following
- Guo Shuo
- Herbie Bradley
- Hou Longpre
- Hou Shayne
- Huang Alexey
- Hugging Face
- Jacob Steinhardt
- Jonathan Tow
- Junjie Huang
- Large language model
- Leo Jonathan
- Likelihood-ratio test
- Mantas Mazeika
- Mažeika
- Model evaluation
- Model performance
- Nathan Scales
- New
- Orhan Firat
- Performance gains
- Pretraining
- Quentin Anthony
- Ren Junjie
- Research area
- Sebastian Gehrmann
- Sharan Narang
- Shayne Longpre
- Shuai Daya
- Shuo Ren
- Sid Black
- Stella Biderman
- Stella Hailey
- Tay Denny
- Tay Hyung
- Tay William
- Training
- Uddeholms AB
- Vu Albert
- Yi Tay
- Zou Mantas