Question 1

What is the LibriVoxDeEn dataset?

Accepted Answer

LibriVoxDeEn contains sentence-aligned triples of German audio, German text, and English translation, built from German audiobooks. The corpus comprises over 100 hours of audio and over 50k parallel sentences.
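
To make the "sentence-aligned triple" idea concrete, here is a minimal sketch of how one such record might be represented once the data is loaded. The class name, field names, and helper function are hypothetical and chosen for illustration only; they do not reflect the dataset's actual file layout, which should be checked against the files at the source.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class AlignedTriple:
    """One sentence-level record (hypothetical field names)."""
    audio_path: str    # assumed: path to the German audio segment
    german_text: str   # German transcription of that segment
    english_text: str  # English translation of the German sentence


def make_triples(
    audio_paths: Sequence[str],
    german_lines: Sequence[str],
    english_lines: Sequence[str],
) -> List[AlignedTriple]:
    """Zip three parallel sequences into triples, assuming index i
    of each sequence refers to the same sentence."""
    if not (len(audio_paths) == len(german_lines) == len(english_lines)):
        raise ValueError("parallel inputs must have the same length")
    return [
        AlignedTriple(a, de.strip(), en.strip())
        for a, de, en in zip(audio_paths, german_lines, english_lines)
    ]
```

The length check matters because a single missing line in any of the three inputs would silently shift every later alignment.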

Question 2

Is LibriVoxDeEn a benchmark?

Accepted Answer

LibriVoxDeEn is a dataset for training or evaluation; it isn't tracked as a standard LLM benchmark in our catalog.

Question 3

Where can I download LibriVoxDeEn?

Accepted Answer

LibriVoxDeEn is available at its source: https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/TMEDTX
