Transformer-XL (257M)
Carnegie Mellon University (CMU), Google Brain · Language modeling/generation · Open weights
Transformer-XL (257M) is a language modeling/generation model from Carnegie Mellon University (CMU) and Google Brain, released in 2019 with approximately 257 million parameters.
About Transformer-XL (257M)
Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length ...
Details
- Provider: Carnegie Mellon University (CMU), Google Brain
- Task: Language modeling/generation
- Parameters: approximately 257 million
- Released: 2019-01-09
- Open weights: Yes