Intelligibility of Text-to-Speech Systems for Mathematical Expressions

Published: June 5, 2025 | arXiv ID: 2506.11086v1

By: Sujoy Roychowdhury, H. G. Ranjani, Sumit Soman and more

Potential Business Impact:

Makes computers read math out loud better.

Business Areas:
Text Analytics Data and Analytics, Software

There has been limited evaluation of advanced Text-to-Speech (TTS) models with Mathematical eXpressions (MX) as inputs. In this work, we design experiments to evaluate the quality and intelligibility of five TTS models through listening and transcribing tests for various categories of MX. Because TTS models cannot process LaTeX directly, we use two Large Language Models (LLMs) to generate English pronunciation from LaTeX MX. We use Mean Opinion Score from user ratings and quantify intelligibility through transcription correctness using three metrics. We also compare listener preference for TTS outputs with respect to a human expert rendition of the same MX. Results establish that the output of TTS models for MX is not necessarily intelligible, and that the gap in intelligibility varies across TTS models and MX categories. For most categories, the performance of TTS models is significantly worse than that of the expert rendition. The effect of the choice of LLM is limited. This establishes the need to improve TTS models for MX.
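The two kinds of measurement named in the abstract, an opinion score from listener ratings and a transcription-correctness score, can be illustrated with a minimal sketch. The abstract does not name its three transcription metrics, so word error rate is used here only as one common choice, and the 1-5 rating scale, the function names, and the sample data are all assumptions made for illustration.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average of listener ratings (assumed here to be on a 1-5 scale)."""
    return mean(ratings)


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical ratings and transcripts, not data from the paper.
ratings = [4, 3, 5, 4]
reference = "x squared plus two x plus one"
transcript = "x squared plus to x plus one"

mos = mean_opinion_score(ratings)
wer = word_error_rate(reference, transcript)
print(mos, wer)
```

In this toy case the listener wrote "to" for "two", which counts as one substituted word out of seven. Homophone errors like this are one reason a word-level metric alone may understate or overstate intelligibility for spoken math.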

Page Count
5 pages

Category
Electrical Engineering and Systems Science:
Audio and Speech Processing