
Towards Unified Video Quality Assessment

Published: December 1, 2025 | arXiv ID: 2512.02224v1

By: Chen Feng, Tianhao Peng, Fan Zhang and more

Potential Business Impact:

Tells you why videos look bad.

Business Areas:

Image Recognition Data and Analytics, Software

Recent works in video quality assessment (VQA) typically employ monolithic models that predict a single quality score for each test video. These approaches cannot provide diagnostic, interpretable feedback, offering little insight into why the video quality is degraded. Most of them are also specialized, format-specific metrics rather than truly "generic" solutions, as they are designed to learn a compromised representation from disparate perceptual domains. To address these limitations, this paper proposes Unified-VQA, a framework that provides a single, unified quality model applicable to various distortion types within multiple video formats by recasting generic VQA as a Diagnostic Mixture-of-Experts (MoE) problem. Unified-VQA employs multiple "perceptual experts" dedicated to distinct perceptual domains. A novel multi-proxy expert training strategy is designed to optimize each expert using a ranking-inspired loss, guided by the most suitable proxy metric for its domain. We also integrate a diagnostic multi-task head into this framework to generate a global quality score and an interpretable multi-dimensional artifact vector. The head is optimized using a weakly-supervised learning strategy that leverages the known properties of the large-scale training database generated for this work. With static model parameters (without retraining or fine-tuning), Unified-VQA demonstrates consistent and superior performance compared to over 18 benchmark methods for both generic VQA and diagnostic artifact detection tasks across 17 databases containing diverse streaming artifacts in HD, UHD, HDR and HFR formats. This work represents an important step towards practical, actionable, and interpretable video quality assessment.
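
The abstract does not spell out the ranking-inspired loss, so the following is only a minimal sketch of the general idea: an expert is rewarded when it orders two videos the same way a proxy metric does. It assumes a RankNet-style logistic pairwise loss as a stand-in; the function names and the choice of loss are hypothetical and are not taken from the paper.

```python
import math


def pairwise_ranking_loss(pred_a, pred_b, proxy_a, proxy_b):
    """Logistic pairwise loss for one pair of videos (illustrative only).

    pred_a, pred_b   : scores predicted by a single expert (hypothetical names)
    proxy_a, proxy_b : scores from the proxy metric assumed for that expert's domain
    """
    # +1 if the proxy ranks A above B, -1 if below, 0 if the proxy sees a tie.
    target = (proxy_a > proxy_b) - (proxy_a < proxy_b)
    if target == 0:
        return 0.0  # tied pairs carry no ranking signal

    # Positive margin means the expert agrees with the proxy ordering.
    margin = target * (pred_a - pred_b)

    # Numerically stable softplus(-margin) = log(1 + exp(-margin)).
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def batch_ranking_loss(preds, proxies):
    """Mean pairwise loss over all unordered pairs in a batch."""
    n = len(preds)
    losses = [
        pairwise_ranking_loss(preds[i], preds[j], proxies[i], proxies[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sum(losses) / len(losses) if losses else 0.0
```

Because such a loss depends only on the ordering given by the proxy scores, proxy metrics with different numeric scales could in principle supervise different experts without rescaling. Whether Unified-VQA relies on this property is not stated in the abstract.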

Image Quality Assessment for Machines: Paradigm, Large-scale Database, and Models

CV and Pattern Recognition

Helps computers see clearly in bad pictures.

27 Aug 2025 2

90%

Breaking Annotation Barriers: Generalized Video Quality Assessment via Ranking-based Self-Supervision

CV and Pattern Recognition

Makes videos look better without human help.

6 May 2025 1

90%

Q-CLIP: Unleashing the Power of Vision-Language Models for Video Quality Assessment through Unified Cross-Modal Adaptation

CV and Pattern Recognition

Makes computers judge video quality better, faster.

8 Aug 2025 0


Country of Origin

🇬🇧 United Kingdom

Page Count

15 pages
