Contract-Driven QoE Auditing for Speech and Singing Services: From MOS Regression to Service Graphs

Published: December 4, 2025 | arXiv ID: 2512.04827v1

By: Wenzhang Du

Potential Business Impact:

Tests sound quality better than old ways.

Business Areas:

Semantic Search, Internet Services

Subjective mean opinion scores (MOS) remain the de facto target for non-intrusive speech and singing quality assessment. However, MOS is a scalar that collapses heterogeneous user expectations, ignores service-level objectives, and is difficult to compare across deployment graphs. We propose a contract-driven QoE auditing framework: each service graph G is evaluated under a set of human-interpretable experience contracts C, yielding a contract-level satisfaction vector Q(G, C). We show that (i) classical MOS regression is a special case with a degenerate contract set, (ii) contract-driven quality is more stable than MOS under graph view transformations (e.g., pooling by system vs. by system type), and (iii) the effective sample complexity of learning contracts is governed by contract semantics rather than merely the dimensionality of C. We instantiate the framework on URGENT2024 MOS (6.9k speech utterances with raw rating vectors) and SingMOS v1 (7,981 singing clips; 80 systems). On URGENT, we train a contract-aware neural auditor on self-supervised WavLM embeddings; on SingMOS, we perform contract-driven graph auditing using released rating vectors and metadata without decoding audio. Empirically, our auditor matches strong MOS predictors in MOS accuracy while providing calibrated contract probabilities; on SingMOS, Q(G, C) exhibits substantially smaller cross-view drift than raw MOS and graph-only baselines; on URGENT, difficulty curves reveal that mis-specified "simple" contracts can be harder to learn than richer but better aligned contract sets.
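To make the idea of a contract-level satisfaction vector and cross-view drift concrete, here is a minimal Python sketch. It is not the paper's implementation: the two contracts, the clip records, the node-averaging rule in `audit`, and the `drift` measure are all hypothetical choices made for illustration. It only assumes, as the abstract states, that raw rating vectors and system metadata are available per clip.

```python
from collections import defaultdict

# Hypothetical contracts (not from the paper): each maps one clip's raw
# rating vector to True (satisfied) or False (violated).
CONTRACTS = {
    "mean_at_least_3.5": lambda r: sum(r) / len(r) >= 3.5,
    "no_rating_below_3": lambda r: min(r) >= 3,
}

def audit(clips, view, contracts=CONTRACTS):
    """Assumed form of Q(G, C) under one graph view: pool clips into nodes
    by `view`, compute each node's satisfaction rate per contract, then
    average those rates over nodes."""
    nodes = defaultdict(list)
    for clip in clips:
        nodes[clip[view]].append(clip["ratings"])
    q = {}
    for name, holds in contracts.items():
        rates = [
            sum(1.0 for r in vecs if holds(r)) / len(vecs)
            for vecs in nodes.values()
        ]
        q[name] = sum(rates) / len(rates)
    return q

def drift(q_a, q_b):
    """Assumed drift measure: largest absolute change in any contract's
    satisfaction between two views."""
    return max(abs(q_a[k] - q_b[k]) for k in q_a)

# Made-up toy data: three systems of two types, one clip each.
clips = [
    {"system": "A", "type": "svs", "ratings": [4, 4, 5]},
    {"system": "B", "type": "svs", "ratings": [3, 4, 4]},
    {"system": "C", "type": "svc", "ratings": [2, 3, 3]},
]

q_by_system = audit(clips, "system")
q_by_type = audit(clips, "type")
print(q_by_system, q_by_type, drift(q_by_system, q_by_type))
```

In this toy data, pooling by system and pooling by system type give different satisfaction vectors because the two types contain different numbers of systems. That mirrors the kind of view transformation the abstract refers to, though the paper's own definitions of Q(G, C) and drift may differ.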

From Scores to Preferences: Redefining MOS Benchmarking for Speech Quality Reward Modeling

Sound

Makes computer voices sound more real.

1 Oct 2025 0

88%

SingMOS-Pro: An Comprehensive Benchmark for Singing Quality Assessment

Sound

Helps computers judge how good a fake singing voice sounds.

2 Oct 2025 1

88%

Bridging Subjective and Objective QoE: Operator-Level Aggregation Using LLM-Based Comment Analysis and Network MOS Comparison

Networking and Internet Architecture

Helps internet companies know if videos are good.

1 Jun 2025 0

Page Count

10 pages
