NLP Models Repository

List of Natural Language Processing Models to use in your AI projects

Transformers models covering tasks from summarization and text classification to text generation. We hope you find this library useful in your development endeavors. Short example loading snippets for several of the models follow the table.

Model Name | Description
distilbert-base-uncased | DistilBERT is a transformers model, smaller and faster than BERT, which was pretrained on the same corpus in a self-supervised fashion using the BERT base model as a teacher. This model is uncased: it does not make a difference between english and English. The model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification or question answering. For tasks such as text generation you should look at a model like GPT-2. You can use the raw model for either masked language modeling or next sentence prediction, but it is mostly intended to be fine-tuned on a downstream task.
bert-base-uncased | BERT is a transformers model pretrained on English-language data in a self-supervised fashion using a masked language modeling (MLM) objective. This model is uncased: it does not make a difference between english and English. It is mostly intended to be fine-tuned on downstream tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification or question answering. The team releasing BERT did not write a model card for this model, so this model card has been written by the Hugging Face team. For tasks such as text generation you should look at a model like GPT-2. The model was pretrained on the raw texts only, with no humans labeling them in any way.
bert-base-cased | BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This model is case-sensitive: it makes a difference between english and English. It is mostly intended to be fine-tuned on downstream tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification or question answering. The team releasing BERT did not write a model card for this model, so this model card has been written by the Hugging Face team. It was introduced in a paper and first released in the original BERT repository. The model was pretrained with two objectives: masked language modeling (MLM) and next sentence prediction (NSP).
cl-tohoku/bert-base-japanese-whole-word-masking | This is a BERT model pretrained on texts in the Japanese language. It processes input texts with word-level tokenization based on the IPA dictionary, followed by WordPiece subword tokenization. The model is trained with whole word masking enabled for the masked language modeling (MLM) objective. The code for the pretraining is available at cl-tohoku/bert-japanese. The training corpus is 2.6GB in size, consisting of approximately 17M sentences. The vocabulary size is 32,000. The models are distributed under the terms of the Creative Commons Attribution-ShareAlike 3.0 license.
jplu/tf-xlm-roberta-base | XLM-RoBERTa is a scaled cross-lingual sentence encoder. XLM-R achieves state-of-the-art results on multiple cross-lingual benchmarks. It is trained on 2.5TB of data across 100 languages, filtered from Common Crawl. All models are available on the Hugging Face model hub for TensorFlow and can be loaded like: TFXLMRobertaModel.from_pretrained("jplu/tf-xlm-roberta-base") (a fuller loading sketch follows this table).
microsoft/codebert-base | Pretrained weights for CodeBERT: A Pre-Trained Model for Programming and Natural Languages. The model is trained on bi-modal data (documents and code) from CodeSearchNet, using an MLM+RTD objective (cf. the paper). Please see the official repository for scripts that support code search and code-to-document generation. The paper's authors are Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou.
xlm-roberta-base | XLM-RoBERTa is a multilingual version of RoBERTa, pretrained in a self-supervised fashion on 2.5TB of filtered CommonCrawl data containing 100 languages, using the masked language modeling (MLM) objective. It is mostly intended to be fine-tuned on downstream tasks such as sequence classification, token classification or question answering.
roberta-large | RoBERTa is a transformers model pretrained on a large corpus of English data in a self-supervised fashion with the masked language modeling (MLM) objective. This is the large variant. The model is case-sensitive and is mostly intended to be fine-tuned on downstream tasks such as sequence classification, token classification or question answering.
bert-large-uncased | BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This model is uncased: it does not make a difference between english and English. It was introduced in a paper and first released in the original BERT repository. The model was pretrained with two objectives: masked language modeling (MLM) and next sentence prediction (NSP). This way, the model learns an inner representation of the English language that can then be used to extract features for downstream tasks like sequence classification, token classification or question answering. For tasks such as text generation you should look at a model like GPT-2.
gpt2 | GPT-2 is a transformers model pretrained on a very large corpus of English data in a self-supervised fashion. It was trained to guess the next word in sentences using a causal language modeling objective. The model learns an inner representation of the English language that can then be used to extract features that are useful for downstream tasks. You can use the raw model for text generation or fine-tune it on a downstream task; see the model hub to look for fine-tuned versions on a task that interests you. The model is best at what it was pretrained for, however, which is generating text from a prompt. The team releasing GPT-2 did not write a model card for this model, so this model card has been written by the Hugging Face team.
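Several of the models above (distilbert-base-uncased, bert-base-uncased, bert-base-cased, bert-large-uncased, roberta-large, xlm-roberta-base) are masked language models. A minimal sketch of fill-mask inference with the Hugging Face transformers pipeline, assuming transformers and a backend such as PyTorch are installed; the example sentence is illustrative:

```python
from transformers import pipeline

# Any of the BERT-family checkpoints from the table can be substituted here.
# BERT/DistilBERT checkpoints use the [MASK] token; RoBERTa-style checkpoints
# (roberta-large, xlm-roberta-base) use <mask> instead.
unmasker = pipeline("fill-mask", model="distilbert-base-uncased")

# Returns the top candidate tokens for the masked position with their scores.
print(unmasker("Hello, I'm a [MASK] model."))
```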
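The cased vs. uncased distinction mentioned in the descriptions can be seen directly in the tokenizers. A small sketch; the example words are illustrative and the exact subword splits depend on each vocabulary:

```python
from transformers import AutoTokenizer

cased = AutoTokenizer.from_pretrained("bert-base-cased")
uncased = AutoTokenizer.from_pretrained("bert-base-uncased")

# The cased tokenizer keeps "english" and "English" distinct,
# while the uncased tokenizer lowercases both before tokenizing.
print(cased.tokenize("english English"))
print(uncased.tokenize("english English"))
```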
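For cl-tohoku/bert-base-japanese-whole-word-masking, the word-level tokenization based on the IPA dictionary requires extra dependencies. A sketch assuming fugashi and ipadic are installed alongside transformers; the sample sentence is illustrative:

```python
# pip install transformers fugashi ipadic
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "cl-tohoku/bert-base-japanese-whole-word-masking"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Word-level segmentation via MeCab + IPA dictionary, then WordPiece subwords.
print(tokenizer.tokenize("吾輩は猫である。"))
```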
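The jplu/tf-xlm-roberta-base row quotes the TensorFlow loading call; here is a slightly fuller sketch, assuming TensorFlow and sentencepiece are installed. The input text is illustrative:

```python
from transformers import AutoTokenizer, TFXLMRobertaModel

tokenizer = AutoTokenizer.from_pretrained("jplu/tf-xlm-roberta-base")
model = TFXLMRobertaModel.from_pretrained("jplu/tf-xlm-roberta-base")

inputs = tokenizer("Hello world", return_tensors="tf")
outputs = model(inputs)

# Base-size encoder: hidden size 768.
print(outputs.last_hidden_state.shape)
```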
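For microsoft/codebert-base, a sketch of extracting contextual embeddings for a natural-language/code pair with PyTorch. The query and code snippet are illustrative, and this uses the generic transformers pair-encoding pattern rather than the exact recipe from the official repository:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

nl = "return the maximum value"                    # natural-language query
code = "def max_value(xs): return max(xs)"         # code snippet
inputs = tokenizer(nl, code, return_tensors="pt")  # encode the bi-modal pair

with torch.no_grad():
    outputs = model(**inputs)

# Token-level contextual embeddings (hidden size 768 for the base model).
print(outputs.last_hidden_state.shape)
```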
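Finally, a sketch of text generation with gpt2 via the pipeline API, assuming transformers with a PyTorch or TensorFlow backend; the prompt and generation settings are illustrative:

```python
from transformers import pipeline, set_seed

generator = pipeline("text-generation", model="gpt2")
set_seed(42)  # make the sampled continuations reproducible

# Generate two short continuations of the prompt.
print(generator("Hello, I'm a language model,", max_length=30, num_return_sequences=2))
```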



Classify and extract text 10x better and faster 🦾

Metatext helps you classify and extract information from text and documents using language models customized with your data and expertise.