An Effective Strategy for Modeling Score Ordinality and Non-uniform Intervals in Automated Speaking Assessment

Published: August 27, 2025 | arXiv ID: 2509.03372v1

By: Tien-Hong Lo, Szu-Yu Chen, Yao-Ting Sung and more

Potential Business Impact:

Helps computers judge how well people speak English.

Business Areas:
Natural Language Processing Artificial Intelligence, Data and Analytics, Software

A recent line of research on automated speaking assessment (ASA) has benefited from self-supervised learning (SSL) representations, which capture rich acoustic and linguistic patterns in non-native speech without requiring assumptions about feature curation. However, speech-based SSL models capture acoustic traits but overlook linguistic content, while text-based SSL models rely on ASR output and fail to encode prosodic nuances. Moreover, most prior work treats proficiency levels as nominal classes, ignoring their ordinal structure and the non-uniform intervals between proficiency labels. To address these limitations, we propose an effective ASA approach that combines SSL representations with handcrafted indicator features via a novel modeling paradigm. We further introduce a multi-margin ordinal loss that jointly models both the score ordinality and the non-uniform intervals of proficiency labels. Extensive experiments on the TEEMI corpus show that our method consistently outperforms strong baselines and generalizes well to unseen prompts.
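
The abstract does not spell out the multi-margin ordinal loss, so the following is only an illustrative sketch of the general idea, not the paper's formulation. It assumes a model that outputs one scalar score per response and uses an all-threshold hinge loss in which each boundary between adjacent proficiency levels has its own margin. The function name, the threshold values, and the margin values are all hypothetical.

```python
def multi_margin_ordinal_loss(score, label, thresholds, margins):
    """Hinge-style ordinal loss with a separate margin per level boundary.

    Hypothetical illustration only (not taken from the paper).
    score      -- scalar prediction for one response
    label      -- integer proficiency level in 0..K-1
    thresholds -- K-1 increasing boundaries between adjacent levels
    margins    -- K-1 non-negative margins, one per boundary
    """
    if len(thresholds) != len(margins):
        raise ValueError("need exactly one margin per threshold")
    loss = 0.0
    for k, (boundary, margin) in enumerate(zip(thresholds, margins)):
        if label > k:
            # The true level lies above boundary k, so the score should
            # exceed the boundary by at least this boundary's margin.
            loss += max(0.0, margin - (score - boundary))
        else:
            # The true level lies at or below boundary k, so the score
            # should stay under the boundary by at least the margin.
            loss += max(0.0, margin + (score - boundary))
    return loss


# Four assumed levels with unevenly spaced boundaries and margins.
EXAMPLE_THRESHOLDS = [1.0, 2.0, 3.5]
EXAMPLE_MARGINS = [0.5, 0.25, 1.0]

example_loss = multi_margin_ordinal_loss(
    2.75, 2, EXAMPLE_THRESHOLDS, EXAMPLE_MARGINS
)  # 0.25: only the top boundary's margin is violated
```

Because every boundary on the wrong side of the score contributes a penalty, a prediction that is several levels off costs more than one that is off by a single level, which is the ordinal behavior a nominal cross-entropy loss lacks. The per-boundary margins are one simple way to express that the gaps between adjacent levels need not be equal.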

Page Count
7 pages

Category
Electrical Engineering and Systems Science:
Audio and Speech Processing