
LearNAT: Learning NL2SQL with AST-guided Task Decomposition for Large Language Models

Published: April 3, 2025 | arXiv ID: 2504.02327v1

By: Weibin Liao, Xin Gao, Tianyu Jia and more

Potential Business Impact:

Lets computers understand database questions better.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Natural Language to SQL (NL2SQL) has emerged as a critical task for enabling seamless interaction with databases. Recent advancements in Large Language Models (LLMs) have demonstrated remarkable performance in this domain. However, existing NL2SQL methods predominantly rely on closed-source LLMs leveraging prompt engineering, while open-source models typically require fine-tuning to acquire domain-specific knowledge. Despite these efforts, open-source LLMs struggle with complex NL2SQL tasks due to the indirect expression of user query objectives and the semantic gap between user queries and database schemas. Inspired by the application of reinforcement learning in mathematical problem-solving to encourage step-by-step reasoning in LLMs, we propose LearNAT (Learning NL2SQL with AST-guided Task Decomposition), a novel framework that improves the performance of open-source LLMs on complex NL2SQL tasks through task decomposition and reinforcement learning. LearNAT introduces three key components: (1) a Decomposition Synthesis Procedure that leverages Abstract Syntax Trees (ASTs) to guide efficient search and pruning strategies for task decomposition, (2) Margin-aware Reinforcement Learning, which employs fine-grained step-level optimization via DPO with AST margins, and (3) Adaptive Demonstration Reasoning, a mechanism for dynamically selecting relevant examples to enhance decomposition capabilities. Extensive experiments on two benchmark datasets, Spider and BIRD, demonstrate that LearNAT enables a 7B-parameter open-source LLM to achieve performance comparable to GPT-4, while offering improved efficiency and accessibility.
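The abstract mentions "DPO with AST margins" but does not give the objective. As a rough illustration only, the sketch below shows the standard DPO loss for one preference pair with an extra margin subtracted inside the sigmoid. The function name `margin_dpo_loss`, the `margin` argument, and the default `beta` are assumptions made for this example; how LearNAT actually derives its margin from the AST is not stated here, so the margin is simply passed in as a number.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def margin_dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    margin: float = 0.0,
    beta: float = 0.1,
) -> float:
    """DPO loss for one (chosen, rejected) step pair with an additive margin.

    Assumption: the margin is a non-negative number supplied by the caller,
    e.g. some measure of how much closer the chosen step's AST is to the
    gold AST than the rejected step's. That mapping is hypothetical.
    """
    # Implicit rewards: scaled log-ratio of policy to reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # The chosen step must beat the rejected step by at least `margin`.
    logit = chosen_reward - rejected_reward - margin
    # -log(sigmoid(logit)) equals softplus(-logit).
    return softplus(-logit)


if __name__ == "__main__":
    # Same log-probabilities, two different margins.
    print(margin_dpo_loss(-1.0, -2.0, -1.5, -1.5, margin=0.0))
    print(margin_dpo_loss(-1.0, -2.0, -1.5, -1.5, margin=0.5))
```

With `margin=0.0` this reduces to plain DPO. A larger margin raises the loss for the same pair, which pushes the policy to separate the two steps more strongly.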

DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword Extraction

Artificial Intelligence

Lets anyone ask computers questions using normal words.

18 Sep 2025

89%

SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Databases

Lets computers understand your questions for data.

11 Apr 2025

88%

Evaluating NL2SQL via SQL2NL

Computation and Language

Makes AI better understand different ways of asking questions.

4 Sep 2025


Country of Origin

🇨🇳 China

Page Count

14 pages
