Question 1

What is the CMU_ARCTIC dataset?

Accepted Answer

The CMU_ARCTIC databases contain around 1,150 utterances carefully selected from out-of-copyright texts from Project Gutenberg. They include US English male (bdl) and female (slt) speakers (both experienced voice talent) as well as other accented speakers.

Question 2

Is CMU_ARCTIC a benchmark?

Accepted Answer

Yes. CMU_ARCTIC is a speech dataset, so it is used as a benchmark for speech models. See the model leaderboards in the Benchmarks section.

Question 3

Where can I download CMU_ARCTIC?

Accepted Answer

CMU_ARCTIC is available at its source: http://festvox.org/cmu_arctic/.
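As a rough sketch of how a download could be scripted, the Python below fetches and unpacks one speaker's archive. Only the base URL comes from the answer above. The `packed/` subdirectory, the `cmu_us_<speaker>_arctic.tar.bz2` file name pattern, and the helper names `archive_url` and `download_and_extract` are assumptions made for illustration, so check the index page for the actual file names before relying on it.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base URL taken from the answer above.
BASE_URL = "http://festvox.org/cmu_arctic/"


def archive_url(speaker: str, base_url: str = BASE_URL) -> str:
    """Build the URL of one speaker's packed archive.

    ASSUMPTION: the 'packed/' directory and the file name pattern are
    hypothetical; verify them against the index page before use.
    """
    return f"{base_url}packed/cmu_us_{speaker}_arctic.tar.bz2"


def download_and_extract(speaker: str, dest: Path) -> Path:
    """Download one speaker's archive into dest and unpack it there."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"cmu_us_{speaker}_arctic.tar.bz2"
    if not archive.exists():
        # Skip the download when the archive is already on disk.
        urllib.request.urlretrieve(archive_url(speaker), archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    return dest


# Example call (performs a network request):
# download_and_extract("bdl", Path("cmu_arctic"))
```

The speaker codes `bdl` and `slt` are the ones named in the first answer. Other speaker codes would need to be looked up at the source.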
