SRU++ Large
ASAPP | Language modeling | Open weights
SRU++ Large is a language modeling model published by ASAPP in 2021, with 234 million parameters.
About SRU++ Large
Large language models have become increasingly difficult to train because of the growing computation time and cost. In this work, we present SRU++, a highly-efficient architecture that combines fast recurrence and attention for sequence modeling. SRU
Details
- Provider: ASAPP
- Task: Language modeling
- Parameters: 234,000,000
- Released: 2021-02-24
- Open weights: Yes