Score: 0

DeepGESI: A Non-Intrusive Objective Evaluation Model for Predicting Speech Intelligibility in Hearing-Impaired Listeners

Published: December 22, 2025 | arXiv ID: 2512.19374v1

By: Wenyu Luo, Jinhui Chen

Speech intelligibility assessment is essential for many speech-related applications. However, most objective intelligibility metrics are intrusive, as they require clean reference speech in addition to the degraded or processed signal for evaluation. Furthermore, existing metrics such as STOI are primarily designed for normal hearing listeners, and their predictive accuracy for hearing impaired speech intelligibility remains limited. On the other hand, the GESI (Gammachirp Envelope Similarity Index) can be used to estimate intelligibility for hearing-impaired listeners, but it is also intrusive, as it depends on reference signals. This requirement limits its applicability in real-world scenarios. To overcome this limitation, this study proposes DeepGESI, a non-intrusive deep learning-based model capable of accurately and efficiently predicting the speech intelligibility of hearing-impaired listeners without requiring any clean reference speech. Experimental results demonstrate that, under the test conditions of the 2nd Clarity Prediction Challenge(CPC2) dataset, the GESI scores predicted by DeepGESI exhibit a strong correlation with the actual GESI scores. In addition, the proposed model achieves a substantially faster prediction speed compared to conventional methods.

Category
Computer Science:
Sound