SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Published: July 16, 2025 | arXiv ID: 2507.12415v1

By: Xinyi He, Qian Liu, Mingzhe Du and more

Potential Business Impact:

Makes computer programs run much faster.

Business Areas:

Application Performance Management, Data and Analytics, Software

Code performance optimization is paramount in real-world software engineering and critical for production-level systems. While Large Language Models (LLMs) have demonstrated impressive capabilities in code generation and bug fixing, their proficiency in enhancing code performance at the repository level remains largely unexplored. To address this gap, we introduce SWE-Perf, the first benchmark specifically designed to systematically evaluate LLMs on code performance optimization tasks within authentic repository contexts. SWE-Perf comprises 140 carefully curated instances, each derived from performance-improving pull requests from popular GitHub repositories. Each benchmark instance includes the relevant codebase, target functions, performance-related tests, expert-authored patches, and executable environments. Through a comprehensive evaluation of representative methods that span file-level and repo-level approaches (e.g., Agentless and OpenHands), we reveal a substantial capability gap between existing LLMs and expert-level optimization performance, highlighting critical research opportunities in this emerging field.

SWE-fficiency: Can Language Models Optimize Real-World Repositories on Real Workloads?

Software Engineering

Helps computers fix slow code automatically.

8 Nov 2025 1

95%

SWE-Bench++: A Framework for the Scalable Generation of Software Engineering Benchmarks from Open-Source Repositories

Software Engineering

Teaches computers to fix and add code.

19 Dec 2025 0

Page Count

22 pages
