
amazon-agi/SIFT-50M

Audio Text To Text, Audio Classification, Text To Speech, Audio To Audio, EN, DE, FR

amazon-agi/SIFT-50M is an audio-text-to-text dataset in English, German, and French (EN, DE, FR), published by amazon-agi in Parquet format. It is distributed under the cdla-sharing-1.0 license, falls in the 10M<n<100M size category, and has been downloaded 7.6K times.

About amazon-agi/SIFT-50M

Dataset Card for SIFT-50M

SIFT-50M (Speech Instruction Fine-Tuning) is a 50-million-example dataset designed for instruction fine-tuning and pre-training of speech-text large language models (LLMs). It is built from publicly available speech co...

Details

Task: Audio Text To Text, Audio Classification, Text To Speech, Audio To Audio
Language: EN, DE, FR
Format: Parquet
Rows / instances: N/A
Size: 10M<n<100M
Creator: amazon-agi
Year: 2025
License: cdla-sharing-1.0
Downloads: 7609
Likes: 38
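
The size category above is written as a range in bucket notation, while the exact row count is listed as N/A. As a minimal sketch, the following Python reads such a bucket string and checks whether a given example count falls inside it. The helper names (`parse_count`, `parse_size_category`, `in_size_category`) are hypothetical, and treating both bounds as strict is an assumption based on the `<` signs in the label. Only the two-sided `low<n<high` form shown on this page is handled.

```python
import re

# Multipliers for the magnitude suffixes used in size-category labels.
_SUFFIX = {"": 1, "K": 10**3, "M": 10**6, "B": 10**9, "T": 10**12}


def parse_count(token: str) -> int:
    """Convert a label such as '10M' or '100' into an integer."""
    match = re.fullmatch(r"(\d+)([KMBT]?)", token.strip())
    if match is None:
        raise ValueError(f"unrecognized count: {token!r}")
    return int(match.group(1)) * _SUFFIX[match.group(2)]


def parse_size_category(category: str) -> tuple[int, int]:
    """Split a two-sided label like '10M<n<100M' into (low, high)."""
    parts = category.split("<n<")
    if len(parts) != 2:
        raise ValueError(f"unsupported size category: {category!r}")
    return parse_count(parts[0]), parse_count(parts[1])


def in_size_category(n: int, category: str) -> bool:
    """Return True if n lies strictly between the bucket bounds (assumed strict)."""
    low, high = parse_size_category(category)
    return low < n < high


if __name__ == "__main__":
    # The description states roughly 50 million examples.
    print(in_size_category(50_000_000, "10M<n<100M"))
```

Under these assumptions, the 50-million-example figure from the description is consistent with the listed 10M<n<100M bucket.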

FAQ