Score: 1

Distributional Clarity: The Hidden Driver of RL-Friendliness in Large Language Models

Published: January 11, 2026 | arXiv ID: 2601.06911v1

By: Shaoning Sun, Mingzhu Cai, Huang He, and more

BigTech Affiliations: Baidu

Potential Business Impact:

Explains why some language models gain far more from reinforcement learning than others, and offers a training strategy that improves RL gains across model families.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Language model families exhibit striking disparities in how much they benefit from reinforcement learning: under identical training, models like Qwen achieve substantial gains, while others like Llama improve only marginally. Complementing data-centric approaches, we show that this disparity reflects a hidden structural property: distributional clarity in probability space. Through a three-stage analysis, from phenomenon to mechanism to interpretation, we find that RL-friendly models exhibit intra-class compactness and inter-class separation in the probabilities they assign to correct vs. incorrect responses. We quantify this clarity with the Silhouette Coefficient ($S$) and demonstrate that (1) high $S$ correlates strongly with RL performance, and (2) low $S$ is associated with severe logic errors and reasoning instability. To confirm this property's role, we introduce a Silhouette-Aware Reweighting strategy that prioritizes low-$S$ samples during training. Experiments on six mathematical benchmarks show consistent improvements across all model families, with gains of up to 5.9 points on AIME24. Our work establishes distributional clarity as a fundamental, trainable property underlying RL-friendliness.
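The abstract does not spell out how $S$ is computed or how the reweighting is applied, but a minimal sketch is possible under stated assumptions: treat each prompt's sampled correct and incorrect responses as two clusters in one-dimensional probability space, score their compactness and separation with the standard Silhouette Coefficient, and upweight low-$S$ samples. The function names, the softmax weighting form, and the `temperature` parameter below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np
from sklearn.metrics import silhouette_score

def response_clarity(probs_correct, probs_incorrect):
    """Silhouette Coefficient S for one prompt, treating the model's
    probabilities for correct vs. incorrect responses as two clusters.
    High S = compact, well-separated groups (distributional clarity);
    low S = overlapping groups. Assumes >= 2 samples per group."""
    x = np.concatenate([probs_correct, probs_incorrect]).reshape(-1, 1)
    labels = np.concatenate([np.ones(len(probs_correct)),
                             np.zeros(len(probs_incorrect))])
    return silhouette_score(x, labels)

def silhouette_aware_weights(per_sample_S, temperature=1.0):
    """Hypothetical reweighting: softmax over -S so low-clarity samples
    receive larger training weight. The paper's exact scheme may differ."""
    s = np.asarray(per_sample_S, dtype=float)
    w = np.exp(-s / temperature)
    return w / w.sum()

# Example: a prompt where correct responses get clearly higher probability
# (high S) vs. one where the two groups overlap (low S).
s_clear = response_clarity([0.82, 0.79, 0.85], [0.11, 0.09, 0.15])
s_muddy = response_clarity([0.42, 0.51, 0.38], [0.44, 0.36, 0.49])
print(silhouette_aware_weights([s_clear, s_muddy]))  # muddy sample weighted up
```

On this reading, the reweighting acts as a curriculum signal: prompts where the model cannot yet separate correct from incorrect responses are exactly where extra training pressure is applied.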

Country of Origin
🇨🇳 China

Page Count
15 pages

Category
Computer Science:
Computation and Language