Large Language Models in the Data Science Lifecycle: A Systematic Mapping Study

Published: August 12, 2025 | arXiv ID: 2508.11698v1

By: Sai Sanjna Chintakunta, Nathalia Nascimento, Everton Guimaraes

Potential Business Impact:

Maps where and how LLMs are applied across data science tasks, how they are evaluated, and where they fall short.

In recent years, Large Language Models (LLMs) have emerged as transformative tools across numerous domains, impacting how professionals approach complex analytical tasks. This systematic mapping study comprehensively examines the application of LLMs throughout the Data Science lifecycle. By analyzing relevant papers from Scopus and IEEE databases, we identify and categorize the types of LLMs being applied, the specific stages and tasks of the data science process they address, and the methodological approaches used for their evaluation. Our analysis includes a detailed examination of evaluation metrics employed across studies and systematically documents both positive contributions and limitations of LLMs when applied to data science workflows. This mapping provides researchers and practitioners with a structured understanding of the current landscape, highlighting trends, gaps, and opportunities for future research in this rapidly evolving intersection of LLMs and data science.