Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward
By: Han Weng, Puzhen Wu, Cui Longjie, and more
Potential Business Impact:
Makes computers write better database queries from plain-language questions.
Reinforcement learning (RL) has been widely adopted to enhance the performance of large language models (LLMs) on Text-to-SQL tasks. However, existing methods often rely on execution-based or LLM-based Bradley-Terry reward models. The former suffers from high execution latency caused by repeated database calls, whereas the latter imposes substantial GPU memory overhead, both of which significantly hinder the efficiency and scalability of RL pipelines. To address these limitations, we propose a novel Text-to-SQL RL fine-tuning framework named Graph-Reward-SQL, which employs the GMNScore outcome reward model. We leverage SQL graph representations to provide accurate reward signals while significantly reducing inference time and GPU memory usage. Building on this foundation, we further introduce StepRTM, a stepwise reward model that provides intermediate supervision over Common Table Expression (CTE) subqueries. This encourages both functional correctness and structural clarity of the generated SQL. Extensive comparative and ablation experiments on standard benchmarks, including Spider and BIRD, demonstrate that our method consistently outperforms existing reward models.
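The abstract only describes the reward computation at a high level, so the sketch below illustrates the general idea of an execution-free, graph-based SQL reward. It is a hypothetical illustration, not the paper's code: it parses predicted and gold SQL into AST graphs with sqlglot and networkx (both assumptions), and a simple label-overlap ratio stands in for the learned Graph Matching Network behind GMNScore; the stepwise CTE reward (StepRTM) is not shown.

```python
# Hypothetical sketch of an execution-free, graph-based SQL reward.
# A crude node-label overlap score stands in for the paper's learned
# Graph Matching Network (GMNScore); all names here are illustrative only.
import sqlglot
import networkx as nx


def sql_to_graph(sql: str, dialect: str = "sqlite") -> nx.DiGraph:
    """Parse SQL and turn its AST into a directed graph of typed nodes."""
    ast = sqlglot.parse_one(sql, read=dialect)
    graph = nx.DiGraph()

    def walk(node, parent_id=None):
        node_id = id(node)
        # Label each node by its expression type plus its identifier/literal text.
        name = node.name if getattr(node, "name", "") else ""
        graph.add_node(node_id, label=f"{type(node).__name__}:{name}".lower())
        if parent_id is not None:
            graph.add_edge(parent_id, node_id)
        for child in node.args.values():
            for c in (child if isinstance(child, list) else [child]):
                if isinstance(c, sqlglot.exp.Expression):
                    walk(c, node_id)

    walk(ast)
    return graph


def graph_reward(pred_sql: str, gold_sql: str) -> float:
    """Structural-similarity reward in [0, 1]: Jaccard overlap of node labels."""
    g_pred, g_gold = sql_to_graph(pred_sql), sql_to_graph(gold_sql)
    pred_labels = {d["label"] for _, d in g_pred.nodes(data=True)}
    gold_labels = {d["label"] for _, d in g_gold.nodes(data=True)}
    union = len(pred_labels | gold_labels) or 1
    return len(pred_labels & gold_labels) / union


if __name__ == "__main__":
    gold = "SELECT name FROM users WHERE age > 30"
    pred = "SELECT name FROM users WHERE age >= 30"
    # No database execution is needed to score the prediction.
    print(f"reward = {graph_reward(pred, gold):.3f}")
```

Because the score is computed purely from the query structure, it avoids the repeated database calls of execution-based rewards and the GPU memory cost of LLM-based reward models, which is the efficiency argument the paper makes for its graph-based approach.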
Similar Papers
Reward-SQL: Boosting Text-to-SQL via Stepwise Reasoning and Process-Supervised Rewards
Computation and Language
Makes computers better at answering questions from data.
Reinforcing Code Generation: Improving Text-to-SQL with Execution-Based Learning
Computation and Language
Teaches computers to write correct database queries.
Beyond Query-Level Comparison: Fine-Grained Reinforcement Learning for Text-to-SQL with Automated Interpretable Critiques
Computation and Language
Teaches computers to understand database questions better.