Reflecting on Empirical and Sustainability Aspects of Software Engineering Research in the Era of Large Language Models

Published: October 30, 2025 | arXiv ID: 2510.26538v1

By: David Williams, Max Hort, Maria Kechagia and more

Potential Business Impact:

Improves how we test and use AI in computer programs.

Business Areas:

Software Engineering; Science and Engineering; Software

Software Engineering (SE) research involving the use of Large Language Models (LLMs) has introduced several new challenges related to rigour in benchmarking, contamination, replicability, and sustainability. In this paper, we invite the research community to reflect on how these challenges are addressed in SE. Our results provide a structured overview of current LLM-based SE research at ICSE, highlighting both encouraging practices and persistent shortcomings. We conclude with recommendations to strengthen benchmarking rigour, improve replicability, and address the financial and environmental costs of LLM-based SE.

Guidelines for Empirical Studies in Software Engineering involving Large Language Models

Software Engineering

Makes computer studies easier to check and repeat.

21 Aug 2025 2

92%

Large Language Models for Software Engineering: A Reproducibility Crisis

Software Engineering

Makes science experiments with AI easier to repeat.

29 Nov 2025 0

Country of Origin

🇦🇺 🇬🇧 🇬🇷 Greece, United Kingdom, Australia

Page Count

5 pages
