Enhancing Large Language Model Reasoning with Reward Models: An Analytical Survey
By: Qiyuan Liu, Hao Xu, Xuhong Chen, and more
Potential Business Impact:
Helps AI systems reason through problems more accurately.
Reward models (RMs) play a critical role in enhancing the reasoning performance of LLMs. For example, they can provide training signals to finetune LLMs during reinforcement learning (RL) and help select the best answer from multiple candidates during inference. In this paper, we provide a systematic introduction to RMs, along with a comprehensive survey of their applications in LLM reasoning. We first review fundamental concepts of RMs, including their architectures, training methodologies, and evaluation techniques. Then, we explore their key applications: (1) guiding generation and selecting optimal outputs during LLM inference, (2) facilitating data synthesis and iterative self-improvement for LLMs, and (3) providing training signals in RL-based finetuning. Finally, we discuss critical open questions regarding the selection, generalization, evaluation, and enhancement of RMs, based on existing research and our own empirical findings. Our analysis aims to provide actionable insights for the effective deployment and advancement of RMs for LLM reasoning.
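To make the inference-time use of RMs concrete, below is a minimal sketch of best-of-N answer selection, one of the applications the abstract describes: sample several candidate answers, score each with a reward model, and keep the highest-scoring one. The function `score_with_reward_model` is a hypothetical stand-in (not from the paper); a real system would replace it with a trained outcome or process reward model.

```python
# Minimal sketch of best-of-N selection with a reward model (illustrative only).
from typing import List


def score_with_reward_model(prompt: str, candidate: str) -> float:
    # Placeholder: a real RM returns a learned scalar score for (prompt, answer).
    # A trivial heuristic is used here only to keep the sketch self-contained.
    return float(len(candidate.split()))


def best_of_n(prompt: str, candidates: List[str]) -> str:
    # Score every sampled answer and return the one the RM prefers.
    scores = [score_with_reward_model(prompt, c) for c in candidates]
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_index]


if __name__ == "__main__":
    prompt = "What is 17 * 24?"
    candidates = [
        "408",
        "17 * 24 = 408",
        "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408",
    ]
    print(best_of_n(prompt, candidates))
```

The same scoring interface generalizes to the paper's other applications: the RM score can filter synthesized training data or serve as the reward signal in RL-based finetuning rather than only ranking candidates at inference time.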
Similar Papers
RM-R1: Reward Modeling as Reasoning
Computation and Language
Makes AI explain its answers better.
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
Artificial Intelligence
Teaches AI to understand pictures and words together.
Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle
Computation and Language
Teaches computers to think and follow instructions better.