Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
By: Anisha Gunjal, Anthony Wang, Elaine Lau, and more
Potential Business Impact:
Makes AI assistants more reliable on subjective tasks, such as health advice, by grading their answers against expert checklists.
Extending Reinforcement Learning with Verifiable Rewards (RLVR) to real-world tasks often requires balancing objective and subjective evaluation criteria. However, many such tasks lack a single, unambiguous ground truth, making it difficult to define reliable reward signals for post-training language models. While traditional preference-based methods offer a workaround, they rely on opaque reward functions that are difficult to interpret and prone to spurious correlations. We introduce $\textbf{Rubrics as Rewards}$ (RaR), a framework that uses structured, checklist-style rubrics as interpretable reward signals for on-policy training with Group Relative Policy Optimization (GRPO). Our best RaR method yields up to a $28\%$ relative improvement on HealthBench-1k compared to simple Likert-based approaches, while matching or surpassing the performance of reward signals derived from expert-written references. By treating rubrics as structured reward signals, we show that RaR enables smaller-scale judge models to better align with human preferences and sustain robust performance across model scales.
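To make the idea concrete, here is a minimal sketch (not taken from the paper) of how checklist-style rubric scores from a judge model might be aggregated into a scalar reward for on-policy RL such as GRPO. The judge call is stubbed with a trivial heuristic, and the criterion names and weights are hypothetical.

```python
# Illustrative sketch: turning per-criterion rubric judgments into one reward value.
# Assumptions: a judge returns a score in [0, 1] per criterion; weights are hypothetical.
from dataclasses import dataclass


@dataclass
class RubricItem:
    criterion: str   # e.g., "Recommends consulting a physician"
    weight: float    # relative importance of this criterion


def judge_satisfies(response: str, criterion: str) -> float:
    """Placeholder for a judge-model call scoring one rubric criterion in [0, 1]."""
    # A real system would prompt an LLM judge here; this stub uses keyword overlap.
    return 1.0 if any(word in response.lower() for word in criterion.lower().split()) else 0.0


def rubric_reward(response: str, rubric: list[RubricItem]) -> float:
    """Weighted average of per-criterion judge scores, usable as an RL reward."""
    total_weight = sum(item.weight for item in rubric)
    score = sum(item.weight * judge_satisfies(response, item.criterion) for item in rubric)
    return score / total_weight if total_weight > 0 else 0.0


if __name__ == "__main__":
    rubric = [
        RubricItem("mentions dosage", weight=2.0),
        RubricItem("recommends consulting a physician", weight=1.0),
    ]
    response = "Take the standard dosage and consult your physician if symptoms persist."
    print(f"rubric reward: {rubric_reward(response, rubric):.2f}")
```

Because each criterion is scored separately, the resulting reward stays interpretable: one can inspect which checklist items a response failed rather than relying on a single opaque preference score.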
Similar Papers
Reinforcement Learning with Rubric Anchors
Artificial Intelligence
Teaches AI to write better, more human-like stories.
RubricRL: Simple Generalizable Rewards for Text-to-Image Generation
CV and Pattern Recognition
Makes AI art follow your exact instructions better.
AutoRubric-R1V: Rubric-Based Generative Rewards for Faithful Multimodal Reasoning
Computation and Language
Teaches AI to think step-by-step, not just guess.