Genia Dataset
Created by Kim et al. in 2003, the Genia Dataset contains 1,999 Medline abstracts, selected using a PubMed query for the three MeSH terms "human", "blood cells", and "transcription factors". The English-language corpus has been annotated for part-of-speech, constituency syntax, terms, events, relations, and coreference. The 1,999 abstracts are available in Text and XML file formats.
Dataset Sources
Here you can download the Genia dataset in Text and XML formats.
Download Genia dataset Text, XML files
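Once the XML files are downloaded, the term annotations can be read with a standard XML parser. The following is a minimal sketch using only the Python standard library. The tag and attribute names (`sentence`, `cons`, `sem`) and the inline sample are assumptions made for illustration, so check them against the files you actually download and adjust the parameters accordingly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample imitating term-annotated XML.
# The real files may use different tags, attributes, or nesting.
SAMPLE = """<set>
  <article>
    <title>
      <sentence><cons sem="G#other_name">IL-2 gene expression</cons> requires <cons sem="G#protein_molecule">NF-kappa B</cons>.</sentence>
    </title>
  </article>
</set>"""


def extract_terms(xml_text, term_tag="cons", label_attr="sem"):
    """Return (term text, label) pairs for every element named term_tag."""
    root = ET.fromstring(xml_text)
    terms = []
    for node in root.iter(term_tag):
        # itertext() also collects text from nested child elements.
        text = "".join(node.itertext()).strip()
        terms.append((text, node.get(label_attr)))
    return terms


def sentence_texts(xml_text, sentence_tag="sentence"):
    """Return the plain text of each sentence with the markup removed."""
    root = ET.fromstring(xml_text)
    return ["".join(s.itertext()).strip() for s in root.iter(sentence_tag)]


if __name__ == "__main__":
    for term, label in extract_terms(SAMPLE):
        print(label, term)
    print(sentence_texts(SAMPLE))
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`. The path itself depends on where you saved the download.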
Fine-tune with Genia dataset
Metatext is a powerful no-code tool for training, tuning, and integrating custom NLP models.
Paper
Read the full original Genia paper.