Large Language Models in the Data Science Lifecycle: A Systematic Mapping Study
By: Sai Sanjna Chintakunta, Nathalia Nascimento, Everton Guimaraes
Potential Business Impact:
Helps computers do data science tasks better.
In recent years, Large Language Models (LLMs) have emerged as transformative tools across numerous domains, changing how professionals approach complex analytical tasks. This systematic mapping study examines the application of LLMs throughout the data science lifecycle. By analyzing relevant papers from the Scopus and IEEE databases, we identify and categorize the types of LLMs being applied, the specific stages and tasks of the data science process they address, and the methodological approaches used for their evaluation. Our analysis includes a detailed examination of the evaluation metrics employed across studies and systematically documents both the positive contributions and the limitations of LLMs when applied to data science workflows. This mapping provides researchers and practitioners with a structured understanding of the current landscape, highlighting trends, gaps, and opportunities for future research at the rapidly evolving intersection of LLMs and data science.
Similar Papers
More Parameters Than Populations: A Systematic Literature Review of Large Language Models within Survey Research
Digital Libraries
Helps surveys use AI to gather and understand information.
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
Computation and Language
AI helps scientists discover new things faster.
Large Language Model-based Data Science Agent: A Survey
Artificial Intelligence
Lets computers help scientists analyze data.