ConfClip: Confidence-Weighted and Clipped Reward for Reinforcement Learning in LLMs
By: Bonan Zhang, Zhongqi Chen, Bowen Song, and more
Potential Business Impact:
Makes AI reasoning more accurate and cheaper by teaching the model to weigh how confident it is in its own answers.
Reinforcement learning (RL) has become a standard paradigm for refining large language models (LLMs) beyond pre-training and instruction tuning. A prominent line of work is RL with verifiable rewards (RLVR), which leverages automatically verifiable outcomes (e.g., correctness or executability) to generate reward signals. While efficient, this framework faces two key limitations: first, its binary feedback is too sparse to capture the quality of the reasoning process; second, its coarse-grained rewards can lead to vanishing gradients. Inspired by observations from human learning, we introduce an RL technique that integrates verifiable outcomes with the model's own confidence estimates. This joint design enriches the reward signal, providing finer-grained feedback and implicitly supervising the reasoning process. Experimental results demonstrate that our proposed method enhances RL performance across multiple datasets and reduces token consumption during inference, while incurring negligible additional training cost. Moreover, it can be used as a plug-in module to enhance other state-of-the-art RL methods.
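
The abstract describes the reward design only at a high level. As a rough illustration of how a confidence-weighted, clipped reward could replace the plain 0/1 RLVR signal, here is a minimal Python sketch; the function name confclip_reward, the clip bounds, and the way confidence is obtained (e.g., the model's mean answer-token probability) are assumptions for illustration, not the paper's actual formulation.

def confclip_reward(is_correct: bool, confidence: float,
                    clip_low: float = 0.1, clip_high: float = 0.9) -> float:
    """Illustrative confidence-weighted, clipped reward (sketch only).

    is_correct: outcome of an automatic verifier (e.g., answer check or
    code execution), as in standard RLVR.
    confidence: the model's own confidence estimate for its answer,
    assumed here to be a probability in [0, 1].
    """
    # Clip the self-reported confidence so the reward stays bounded away
    # from the pure 0/1 extremes of binary RLVR, keeping a usable gradient.
    c = min(max(confidence, clip_low), clip_high)
    # Confidently correct answers earn a larger positive reward; confidently
    # wrong answers receive a larger penalty, giving a graded signal instead
    # of sparse binary feedback.
    return c if is_correct else -c

# Example values under these assumed clip bounds:
# confclip_reward(True, 0.95)  -> 0.9  (confidence clipped from 0.95)
# confclip_reward(False, 0.95) -> -0.9
# confclip_reward(True, 0.55)  -> 0.55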
Similar Papers
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
Computation and Language
Teaches computers to solve math problems better.
Rewarding Doubt: A Reinforcement Learning Approach to Calibrated Confidence Expression of Large Language Models
Computation and Language
Makes AI tell you when it's sure or guessing.
Masked-and-Reordered Self-Supervision for Reinforcement Learning from Verifiable Rewards
Computation and Language
Teaches computers to solve math problems better.