Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences

Published: October 27, 2025 | arXiv ID: 2510.23451v1

By: Zhuoran Jin, Hongbang Yuan, Kejian Zhu, and more

Potential Business Impact:

AI that learns your preferences in any format: text, images, video, audio, or 3D.

Business Areas:
A/B Testing, Data and Analytics

Reward models (RMs) play a critical role in aligning AI behaviors with human preferences, yet they face two fundamental challenges: (1) Modality Imbalance, where most RMs focus primarily on text and image modalities, offering limited support for video, audio, and other modalities; and (2) Preference Rigidity, where training on fixed binary preference pairs fails to capture the complexity and diversity of personalized preferences. To address these challenges, we propose Omni-Reward, a step toward generalist omni-modal reward modeling with support for free-form preferences, consisting of: (1) Evaluation: We introduce Omni-RewardBench, the first omni-modal RM benchmark with free-form preferences, covering nine tasks across five modalities (text, image, video, audio, and 3D); (2) Data: We construct Omni-RewardData, a multimodal preference dataset comprising 248K general preference pairs and 69K instruction-tuning pairs for training generalist omni-modal RMs; (3) Model: We propose Omni-RewardModel, which includes both discriminative and generative RMs, and achieves strong performance on Omni-RewardBench as well as other widely used reward modeling benchmarks.
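The abstract does not spell out the training objective, but discriminative RMs trained on binary preference pairs are conventionally fit with a Bradley-Terry pairwise loss. Below is a minimal sketch under that assumption; `ScalarRewardHead` and the random embeddings are hypothetical stand-ins for a real multimodal encoder, not the paper's Omni-RewardModel.

```python
# Minimal sketch of the standard Bradley-Terry pairwise loss used to train
# discriminative reward models on binary preference pairs. Generic
# illustration only; encoder and names here are hypothetical placeholders,
# not the paper's actual Omni-RewardModel objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScalarRewardHead(nn.Module):
    """Maps a pooled (multi)modal embedding to a scalar reward."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, pooled_embedding: torch.Tensor) -> torch.Tensor:
        return self.head(pooled_embedding).squeeze(-1)  # shape: (batch,)

def pairwise_preference_loss(reward_chosen: torch.Tensor,
                             reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry loss: -log sigmoid(r_chosen - r_rejected)."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

if __name__ == "__main__":
    # Toy usage: random embeddings stand in for encoder outputs.
    hidden_dim, batch = 64, 8
    rm = ScalarRewardHead(hidden_dim)
    emb_chosen = torch.randn(batch, hidden_dim)    # preferred responses
    emb_rejected = torch.randn(batch, hidden_dim)  # dispreferred responses
    loss = pairwise_preference_loss(rm(emb_chosen), rm(emb_rejected))
    loss.backward()
    print(f"pairwise loss: {loss.item():.4f}")
```

Under this framing, free-form preferences would enter as an extra conditioning input to the encoder (e.g., a natural-language criterion alongside the prompt) rather than as a change to the loss itself.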

Page Count
48 pages

Category
Computer Science: Computation and Language