Towards Temporally Explainable Dysarthric Speech Clarity Assessment
By: Seohyun Park, Chitralekha Gupta, Michelle Kah Yian Kwan and more
Potential Business Impact:
Helps people with speech problems practice speaking better.
Dysarthria, a motor speech disorder, affects intelligibility and requires targeted interventions for effective communication. In this work, we investigate automated mispronunciation feedback by collecting a dysarthric speech dataset from six speakers reading two passages, annotated by a speech therapist with temporal markers and mispronunciation descriptions. We design a three-stage framework for explainable mispronunciation evaluation: (1) overall clarity scoring, (2) mispronunciation localization, and (3) mispronunciation type classification. We systematically analyze pretrained Automatic Speech Recognition (ASR) models in each stage, assessing their effectiveness in dysarthric speech evaluation. Our findings offer clinically relevant insights for automating actionable feedback for pronunciation assessment, which could enable independent practice for patients and help therapists deliver more effective interventions. Code is available at https://github.com/augmented-human-lab/interspeech25_speechtherapy, with a supplementary webpage at https://apps.ahlab.org/interspeech25_speechtherapy/.