Deploying UDM Series in Real-Life Stuttered Speech Applications: A Clinical Evaluation Framework
By: Eric Zhang, Li Wei, Sarah Chen and more
Potential Business Impact:
Helps doctors diagnose speech problems faster.
Stuttered and dysfluent speech detection systems have traditionally suffered from a trade-off between accuracy and clinical interpretability. While end-to-end deep learning models achieve high performance, their black-box nature limits clinical adoption. This paper evaluates the Unconstrained Dysfluency Modeling (UDM) series, the current state-of-the-art framework developed at Berkeley, which combines a modular architecture, explicit phoneme alignment, and interpretable outputs for real-world clinical deployment. Through extensive experiments involving patients and certified speech-language pathologists (SLPs), we demonstrate that UDM achieves state-of-the-art performance (F1: 0.89 ± 0.04) while providing clinically meaningful interpretability scores (4.2/5.0). Our deployment study shows an 87% clinician acceptance rate and a 34% reduction in diagnostic time. The results provide strong evidence that UDM represents a practical pathway toward AI-assisted speech therapy in clinical environments.
Similar Papers
A Comparative Study of Controllability, Explainability, and Performance in Dysfluency Detection Models
Artificial Intelligence
Helps doctors understand speech problems better.
Leveraging LLM for Stuttering Speech: A Unified Architecture Bridging Recognition and Event Detection
Sound
Helps computers understand people who stutter better.
Revisiting Rule-Based Stuttering Detection: A Comprehensive Analysis of Interpretable Models for Clinical Applications
Artificial Intelligence
Helps doctors understand stuttering better.