Intelligibility of Text-to-Speech Systems for Mathematical Expressions
By: Sujoy Roychowdhury, H. G. Ranjani, Sumit Soman, and more
Potential Business Impact:
Makes computers read math out loud better.
There has been limited evaluation of advanced Text-to-Speech (TTS) models with Mathematical eXpressions (MX) as inputs. In this work, we design experiments to evaluate the quality and intelligibility of five TTS models through listening and transcription tests across various categories of MX. Since TTS models cannot process LaTeX directly, we use two Large Language Models (LLMs) to generate English pronunciations from LaTeX MX. We measure quality using the Mean Opinion Score from user ratings and quantify intelligibility through transcription correctness using three metrics. We also compare listener preference for TTS outputs against human expert renditions of the same MX. Results establish that TTS output for MX is not necessarily intelligible, and that the intelligibility gap varies across TTS models and MX categories. For most categories, the performance of TTS models is significantly worse than that of expert renditions. The effect of the choice of LLM is limited. These findings establish the need to improve TTS models for MX.
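The abstract quantifies intelligibility through transcription correctness using three metrics, but does not name them here. As an illustration only, the sketch below computes Word Error Rate (WER), a common transcription-correctness metric; treating WER as one of the paper's three metrics is an assumption, and the example expression and transcriptions are hypothetical.

```python
# Illustrative sketch, not the paper's actual evaluation code.
# WER = word-level edit distance between a reference spoken form of an MX
# and a listener's transcription, normalized by reference length.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein distance over words, divided by reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Hypothetical example: spoken form of \frac{a}{b} vs. a transcription
reference = "a over b"
hypothesis = "a of b"
print(word_error_rate(reference, hypothesis))  # 1 substitution / 3 words
```

A lower WER indicates that listeners transcribed the TTS rendition of the expression more faithfully, which is how transcription correctness reflects intelligibility.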
Similar Papers
Speech-to-LaTeX: New Models and Datasets for Converting Spoken Equations and Sentences
CV and Pattern Recognition
Lets computers write math from spoken words.
Beyond Final Answers: Evaluating Large Language Models for Math Tutoring
Human-Computer Interaction
Helps computers teach math, but they make mistakes.
Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback
Computation and Language
Teaches math better in any language.