ScaleAI/SWE-bench_Pro

General NLP, English, Benchmark

Created by ScaleAI in 2025, ScaleAI/SWE-bench_Pro is an English-language General NLP benchmark dataset distributed in Parquet format.

📊 This dataset is used as an LLM benchmark.

About ScaleAI/SWE-bench_Pro

Dataset Summary: SWE-Bench Pro is a challenging, enterprise-level dataset for testing agent ability on long-horizon software engineering tasks.
Paper: https://static.scale.com/uploads/654197dc94d34f66c0f5184e/SWEAP_Eval_Scale%20(9).pdf
See the r...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: ScaleAI
Year: 2025
