Reinforcing Code Generation: Improving Text-to-SQL with Execution-Based Learning
By: Atharv Kulkarni, Vivek Srikumar
Potential Business Impact:
Teaches computers to write database queries that return correct answers.
In this work, we study the problem of code generation with a large language model (LLM), with a focus on generating SQL queries from natural language questions. We ask: Instead of using supervised fine-tuning with text-code pairs, can we tune a model by having it interact with a database engine? We frame this as a reinforcement learning problem in which the model receives execution-based feedback from the environment in the form of scalar rewards. These rewards penalize execution failures and assign positive values when a query returns a correct answer. We use these rewards within the Group Relative Policy Optimization (GRPO) framework. We evaluate our approach on a tabular reasoning benchmark. We find that with only weak supervision in the form of question-answer pairs, RL-tuning improves the accuracy of model-generated SQL code from 31.49% to 49.83% while reducing the error percentage from 25.43% to 14.71%. This improvement allows the model to nearly match the performance of the much larger SQLCoder-70B model. Our work demonstrates the potential of using execution-based feedback to improve the symbolic reasoning capabilities of LLMs.
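To make the reward design concrete, here is a minimal sketch of an execution-based reward of the kind the abstract describes, paired with GRPO-style group-relative advantage normalization. It assumes a SQLite database and gold answers given as sets of row tuples; the function names and the exact reward values (-1/0/+1) are illustrative assumptions, not the authors' implementation.

import sqlite3


def execution_reward(sql: str, db_path: str, gold_answer: set) -> float:
    """Scalar reward from executing a generated query:
    -1.0 if the query fails to execute, +1.0 if its result set
    matches the gold answer, 0.0 otherwise (values assumed here)."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return -1.0  # penalize execution failures
    finally:
        conn.close()
    # Weak supervision: compare only the returned answer, never the query text.
    return 1.0 if {tuple(r) for r in rows} == gold_answer else 0.0


def group_advantages(rewards: list[float]) -> list[float]:
    """GRPO normalizes each sampled query's reward against the group
    of samples drawn for the same question: A_i = (r_i - mean) / std."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    return [(r - mean) / (std + 1e-8) for r in rewards]  # epsilon avoids div-by-zero

Because the advantage is computed relative to the group, a query only earns a positive update when it executes and answers correctly while some of its sampled siblings do not, which is what drives the model away from failing or wrong-answer generations.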
Similar Papers
Sparks of Tabular Reasoning via Text2SQL Reinforcement Learning
Computation and Language
Teaches computers to understand and use data tables.
ConstrainedSQL: Training LLMs for Text2SQL via Constrained Reinforcement Learning
Machine Learning (CS)
Teaches computers to answer questions from data better.
Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward
Machine Learning (CS)
Makes computers write better database queries.