Agentic NL2SQL to Reduce Computational Costs
By: Dominik Jehle, Lennart Purucker, Frank Hutter
Potential Business Impact:
Lets computers find data faster and more cheaply.
Translating natural language queries into SQL queries (NL2SQL or Text-to-SQL) has recently been empowered by large language models (LLMs). Using LLMs to perform NL2SQL on a large collection of SQL databases requires processing large quantities of meta-information about the databases, which in turn results in lengthy prompts with many tokens and high processing costs. To address this challenge, we introduce the Datalake Agent, an agentic system designed to enable an LLM to solve NL2SQL tasks more efficiently. Instead of utilizing direct solvers for NL2SQL that call the LLM once with all meta-information in the prompt, the Datalake Agent employs an interactive loop to reduce the utilized meta-information. Within the loop, the LLM is used in a reasoning framework that selectively requests only the information necessary to solve a table question answering task. We evaluate the Datalake Agent on a collection of 23 databases with 100 table question answering tasks. The Datalake Agent reduces the tokens used by the LLM by up to 87% and thus allows for substantial cost reductions while maintaining competitive performance.
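A minimal sketch of what such an interactive loop might look like, assuming a simple tool-calling protocol. The catalog contents, the tool names (list_databases, list_tables, get_schema), the JSON action format, and the llm callable are illustrative assumptions, not the authors' implementation; the point is that only the meta-information the model actually requests enters the prompt, rather than every schema up front.

```python
import json
from typing import Callable, Dict, List

# Hypothetical catalog of databases -> tables -> schemas; stands in for a real data lake.
CATALOG: Dict[str, Dict[str, str]] = {
    "sales_db": {"orders": "order_id INT, customer_id INT, total REAL, order_date TEXT"},
    "hr_db": {"employees": "emp_id INT, name TEXT, dept TEXT, salary REAL"},
}

# Tools the agent may call; each returns only a small slice of meta-information.
def list_databases() -> List[str]:
    return list(CATALOG)

def list_tables(db: str) -> List[str]:
    return list(CATALOG.get(db, {}))

def get_schema(db: str, table: str) -> str:
    return CATALOG.get(db, {}).get(table, "unknown table")

TOOLS: Dict[str, Callable[..., object]] = {
    "list_databases": list_databases,
    "list_tables": list_tables,
    "get_schema": get_schema,
}

def agent_loop(question: str, llm: Callable[[str], str], max_steps: int = 8) -> str:
    """Interactive loop: the LLM requests meta-information step by step
    instead of receiving every schema in a single large prompt."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # The LLM is assumed to reply with JSON such as
        # {"action": "get_schema", "args": {"db": "sales_db", "table": "orders"}}
        # or {"action": "final_sql", "args": {"sql": "SELECT ..."}}.
        reply = json.loads(llm(transcript))
        action, args = reply["action"], reply.get("args", {})
        if action == "final_sql":
            return args["sql"]
        observation = TOOLS[action](**args)
        transcript += f"{action}({args}) -> {observation}\n"
    raise RuntimeError("No SQL produced within the step budget")
```

In this sketch the transcript grows only by the schemas the model chose to inspect, which is the mechanism by which an agentic loop can cut token usage relative to a direct solver that embeds the entire data lake's meta-information in one prompt.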
Similar Papers
From Queries to Insights: Agentic LLM Pipelines for Spatio-Temporal Text-to-SQL
Artificial Intelligence
Helps computers answer questions about places and times.
LLM/Agent-as-Data-Analyst: A Survey
Artificial Intelligence
Computers understand and analyze any kind of data.