amphion/Emilia-Dataset
Text To Speech, Automatic Speech Recognition | ZH, EN, JA | cc-by-4.0
amphion/Emilia-Dataset is a Chinese, English, and Japanese (ZH, EN, JA) speech dataset for text-to-speech and automatic speech recognition, published by amphion in 2024. It has 79.3K downloads and 460 likes. It is released under the cc-by-4.0 license and falls in the 10M<n<100M size category.
About amphion/Emilia-Dataset
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation
This is the official repository 👑 for the Emilia dataset and the source code for the Emilia-Pipe speech data preprocessing pipeline.
Ne...
Details
- Task: Text To Speech, Automatic Speech Recognition
- Language: ZH, EN, JA
- Format: Parquet
- Rows / instances: N/A
- Size: 10M<n<100M
- Creator: amphion
- Year: 2024
- License: cc-by-4.0
- Downloads: 79279
- Likes: 460
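
The size label `10M<n<100M` listed above is a range, not an exact count (the row count itself is given as N/A). As a minimal sketch of how such a label can be turned into numeric bounds, the snippet below parses it with a regular expression. The helper name `parse_size_category` and the suffix table are illustrative assumptions, not part of the dataset or any library, and reading `M` as "million" is assumed from the usual convention.

```python
import re

# Assumed multipliers for the suffixes that appear in size labels.
_SUFFIX = {"": 1, "K": 10**3, "M": 10**6, "B": 10**9, "T": 10**12}


def parse_size_category(label: str) -> tuple[int, int]:
    """Return the (lower, upper) exclusive bounds of a label like '10M<n<100M'.

    Hypothetical helper written for this example only.
    """
    match = re.fullmatch(r"(\d+)([KMBT]?)<n<(\d+)([KMBT]?)", label.strip())
    if match is None:
        raise ValueError(f"unrecognized size category: {label!r}")
    lo_num, lo_suf, hi_num, hi_suf = match.groups()
    return int(lo_num) * _SUFFIX[lo_suf], int(hi_num) * _SUFFIX[hi_suf]


lower, upper = parse_size_category("10M<n<100M")
print(f"{lower:,} < n < {upper:,}")  # prints "10,000,000 < n < 100,000,000"
```

Under these assumptions, the label places the dataset between 10 million and 100 million entries, which is all that can be said while the exact row count is unavailable.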