Score: 2

Unveiling Challenges for LLMs in Enterprise Data Engineering

Published: April 15, 2025 | arXiv ID: 2504.10950v1

By: Jan-Micha Bodensohn, Ulf Brackmann, Liane Vogel and more

Potential Business Impact:

Shows where LLMs fall short when automating data engineering on large enterprise tables.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Large Language Models (LLMs) have demonstrated significant potential for automating data engineering tasks on tabular data, giving enterprises a valuable opportunity to reduce the high costs associated with manual data handling. However, the enterprise domain introduces unique challenges that existing LLM-based approaches for data engineering often overlook, such as large table sizes, more complex tasks, and the need for internal knowledge. To bridge these gaps, we identify key enterprise-specific challenges related to data, tasks, and background knowledge and conduct a comprehensive study of their impact on recent LLMs for data engineering. Our analysis reveals that LLMs face substantial limitations in real-world enterprise scenarios, resulting in significant accuracy drops. Our findings contribute to a systematic understanding of LLMs for enterprise data engineering to support their adoption in industry.

Page Count
14 pages

Category
Computer Science: Databases