Question 1

What is the 1.5 billion Words Arabic Corpus dataset?

Accepted Answer

The 1.5 billion Words Arabic Corpus is a text dataset collected from newspaper articles in ten major news sources from eight Arab countries, over a period of fourteen years.

Question 2

Is 1.5 billion Words Arabic Corpus a benchmark?

Accepted Answer

Not in our catalog. 1.5 billion Words Arabic Corpus is a dataset used for training or evaluation; it isn't tracked as a standard LLM benchmark.

Question 3

Where can I download 1.5 billion Words Arabic Corpus?

Accepted Answer

1.5 billion Words Arabic Corpus is available at its source: http://www.abuelkhair.net/index.php/en/arabic/abu-el-khair-corpus.
