End-to-End Text-to-SQL with Dataset Selection: Leveraging LLMs for Adaptive Query Generation

Published: August 8, 2025 | arXiv ID: 2508.06387v2

By: Anurag Tripathi, Vaibhav Patle, Abhinav Jain and more

Potential Business Impact:

Finds the right database for your questions.

Text-to-SQL bridges the gap between natural language and structured database language, allowing non-technical users to query databases easily. Traditional approaches model text-to-SQL as a direct translation task, in which a given Natural Language Query (NLQ) is mapped to an SQL command. Recent advances in large language models (LLMs) have significantly improved translation accuracy; however, these methods all require that the target database be pre-specified. This becomes problematic in scenarios with multiple extensive databases, where identifying the correct database is a crucial yet overlooked step. In this paper, we propose a three-stage end-to-end text-to-SQL framework that identifies the user's intended database before generating SQL queries. Our approach leverages LLMs and prompt engineering to extract implicit information from NLQs in the form of a ruleset. We then train a large db_id prediction model, which includes a RoBERTa-based fine-tuned encoder, to predict the correct database identifier (db_id) from both the NLQ and the LLM-generated rules. Finally, we refine the generated SQL by using critic agents to correct errors. Experimental results demonstrate that our framework outperforms the current state-of-the-art models in both database intent prediction and SQL generation accuracy.
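The three stages described in the abstract can be pictured with a small, self-contained sketch. This is not the authors' implementation: every function name here (`extract_rules`, `predict_db_id`, `refine_sql`, `text_to_sql`) is hypothetical, the LLM rule extractor is replaced by a stop-word filter, the RoBERTa-based classifier by token overlap against schema text, and the critic agents by a SQLite `EXPLAIN` check with a caller-supplied repair function. The example schemas are likewise made up for illustration.

```python
import re
import sqlite3
from typing import Callable, Dict, List, Tuple

# All names are hypothetical; each stage is a toy stand-in for the
# LLM- or encoder-based component named in the abstract.
STOP_WORDS = {"the", "of", "all", "a", "an", "in", "for", "list", "show"}


def tokens(text: str) -> List[str]:
    """Lowercase word tokens (letters and underscores only)."""
    return re.findall(r"[a-z_]+", text.lower())


def extract_rules(nlq: str) -> List[str]:
    """Stage 1 stand-in: an LLM prompt would emit rules; here we keep content words."""
    return [t for t in tokens(nlq) if t not in STOP_WORDS]


def predict_db_id(nlq: str, rules: List[str], schemas: Dict[str, str]) -> str:
    """Stage 2 stand-in: score each database by token overlap with NLQ plus rules."""
    terms = set(tokens(nlq)) | set(rules)
    return max(schemas, key=lambda db_id: len(terms & set(tokens(schemas[db_id]))))


def refine_sql(sql: str, schema_ddl: str,
               repair: Callable[[str, str], str], max_rounds: int = 2) -> str:
    """Stage 3 stand-in: a critic validates the SQL against the schema and
    asks `repair` for a fix on failure. The result of the final repair is
    returned without a further check."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        for _ in range(max_rounds):
            try:
                conn.execute("EXPLAIN " + sql)  # compiles the query without running it
                return sql
            except sqlite3.Error as err:
                sql = repair(sql, str(err))
        return sql
    finally:
        conn.close()


def text_to_sql(nlq: str, schemas: Dict[str, str],
                generate: Callable[[str, str], str],
                repair: Callable[[str, str], str]) -> Tuple[str, str]:
    """Chain the stages: rules, then db_id, then draft SQL, then critic refinement."""
    rules = extract_rules(nlq)
    db_id = predict_db_id(nlq, rules, schemas)
    draft = generate(nlq, schemas[db_id])
    return db_id, refine_sql(draft, schemas[db_id], repair)


# Illustrative schemas and stub generators (assumptions, not from the paper).
SCHEMAS = {
    "concert_singer": "CREATE TABLE singer (name TEXT, age INTEGER);",
    "pets": "CREATE TABLE pet (pet_name TEXT, weight REAL);",
}


def faulty_generate(nlq: str, ddl: str) -> str:
    return "SELECT nme FROM singer"  # deliberately misspelled column


def simple_repair(sql: str, error: str) -> str:
    return sql.replace("nme", "name")


db_id, sql = text_to_sql("List the name of all singer", SCHEMAS,
                         faulty_generate, simple_repair)
print(db_id, "|", sql)
```

Passing `generate` and `repair` as callables keeps the sketch independent of any particular LLM client; in a real system each would wrap a model call, and the overlap scorer would be replaced by a trained classifier.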

Category

Machine Learning (CS)

Exploring the Landscape of Text-to-SQL with Large Language Models: Progresses, Challenges and Opportunities

Computation and Language

Lets computers answer questions from data.

28 May 2025 0

91%

MageSQL: Enhancing In-context Learning for Text-to-SQL Applications with Large Language Models

Databases

Helps computers understand questions to find data.

2 Apr 2025 2

Country of Origin

🇸🇬 Singapore

Page Count

9 pages
