ATH-MaaS/Marco_Longspeech
Automatic Speech Recognition | Audio Classification | Text Generation | EN, ZH | Benchmark | apache-2.0
ATH-MaaS/Marco_Longspeech is an automatic speech recognition benchmark dataset in English and Chinese (EN, ZH), published by ATH-MaaS in Parquet format. It is distributed under the apache-2.0 license, falls in the 10K<n<100K size category, and has been downloaded 11.8K times.
📊 This dataset is used as an LLM benchmark.
About ATH-MaaS/Marco_Longspeech
Marco-LongSpeech Dataset
Marco-LongSpeech is a multi-task long-speech understanding dataset. It contains 8 speech understanding tasks designed to benchmark large language models on lengthy audio inputs.
📊 Dataset Stati...
Details
- Task: Automatic Speech Recognition, Audio Classification, Text Generation
- Language: EN, ZH
- Format: Parquet
- Rows / instances: N/A
- Size: 10K<n<100K
- Creator: ATH-MaaS
- Year: 2026
- License: apache-2.0
- Downloads: 11759
- Likes: 18
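
The exact row count is listed as N/A, so the size category `10K<n<100K` is the only size information given here. The sketch below shows one way to turn such a tag into numeric bounds, for example to sanity-check a row count after loading the Parquet files. The helper names `parse_size_category` and `in_category` are hypothetical, and the suffix table (K, M, B, T as powers of one thousand) is an assumption about how these tags are written, not something this page specifies.

```python
from __future__ import annotations

import re

# Assumed suffix meanings (not confirmed by the dataset card).
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000, "T": 1_000_000_000_000}

_PATTERN = re.compile(r"(\d+)([KMBT]?)<n<(\d+)([KMBT]?)")


def parse_size_category(category: str) -> tuple[int, int]:
    """Return the (lower, upper) exclusive bounds of a tag such as '10K<n<100K'."""
    match = _PATTERN.fullmatch(category.strip())
    if match is None:
        raise ValueError(f"unrecognized size category: {category!r}")
    lo_num, lo_suffix, hi_num, hi_suffix = match.groups()
    return int(lo_num) * _SUFFIX[lo_suffix], int(hi_num) * _SUFFIX[hi_suffix]


def in_category(n_rows: int, category: str) -> bool:
    """Check whether a row count lies strictly inside the category bounds."""
    lower, upper = parse_size_category(category)
    return lower < n_rows < upper


bounds = parse_size_category("10K<n<100K")
print(bounds)  # (10000, 100000)
```

A check such as `in_category(len(rows), "10K<n<100K")` could then flag an incomplete download, where `rows` stands for whatever table was loaded locally.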