Kinship in Speech: Leveraging Linguistic Relatedness for Zero-Shot TTS in Indian Languages

Published: June 4, 2025 | arXiv ID: 2506.03884v1

By: Utkarsh Pathak, Chandra Sai Krishna Gunda, Anusha Prakash and more

Potential Business Impact:

Makes computers speak many new languages.

Business Areas:

Natural Language Processing Artificial Intelligence, Data and Analytics, Software

Text-to-speech (TTS) systems typically require high-quality studio data and accurate transcriptions for training. India has 1369 languages; the 22 official ones use 13 scripts. Training a TTS system for all these languages, most of which have no digital resources, seems a Herculean task. Our work focuses on zero-shot synthesis, particularly for languages whose scripts and phonotactics come from different families. The novelty of our work lies in augmenting a shared phone representation and modifying the text parsing rules to match the phonotactics of the target language, which reduces synthesiser overhead and enables rapid adaptation. Intelligible and natural speech was generated for Sanskrit, Maharashtrian and Canara Konkani, Maithili, and Kurukh by leveraging linguistic connections across languages with suitable synthesisers. Evaluations confirm the effectiveness of this approach, highlighting its potential to expand speech technology access for under-represented languages.
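
To make the idea of a shared phone representation with language-specific parsing rules concrete, here is a minimal sketch in Python. It is not the paper's parser or phone set: the tiny character tables, the phone labels, the `to_phones` function, and the `final_schwa_deletion` flag are all hypothetical and chosen only for illustration. It assumes that Devanagari and Kannada characters can be mapped to common phone labels, and that one rule (dropping the word-final inherent vowel, as Hindi does and Sanskrit does not) stands in for "matching the phonotactics of the target language".

```python
# Hypothetical toy tables: characters from two scripts share one set of phone labels.
CONSONANTS = {
    "\u0915": "k", "\u092e": "m", "\u0930": "r", "\u0932": "l",  # Devanagari ka, ma, ra, la
    "\u0c95": "k", "\u0cae": "m", "\u0cb0": "r", "\u0cb2": "l",  # Kannada ka, ma, ra, la
}
VOWEL_SIGNS = {
    "\u093e": "aa", "\u093f": "i",  # Devanagari vowel signs aa, i
    "\u0cbe": "aa", "\u0cbf": "i",  # Kannada vowel signs aa, i
}
VIRAMAS = {"\u094d", "\u0ccd"}      # Devanagari and Kannada virama


def to_phones(word, final_schwa_deletion=False):
    """Map a word in either script to a list of shared phone labels.

    final_schwa_deletion is an assumed per-language switch: when True, the
    inherent vowel after a word-final consonant is dropped.
    """
    phones = []
    for ch in word:
        if ch in CONSONANTS:
            # Every consonant carries an inherent vowel, written here as "a".
            phones.extend([CONSONANTS[ch], "a"])
        elif ch in VOWEL_SIGNS:
            if not phones:
                raise ValueError("vowel sign without a preceding consonant")
            # A vowel sign replaces the inherent vowel of the previous consonant.
            phones[-1] = VOWEL_SIGNS[ch]
        elif ch in VIRAMAS:
            if not phones:
                raise ValueError("virama without a preceding consonant")
            # A virama removes the inherent vowel of the previous consonant.
            phones.pop()
        else:
            raise ValueError(f"character not in the toy table: {ch!r}")
    # Language-specific rule: drop the inherent vowel of a word-final consonant.
    if final_schwa_deletion and word and word[-1] in CONSONANTS and len(phones) > 2:
        phones.pop()
    return phones


if __name__ == "__main__":
    devanagari_word = "\u0915\u092e\u0932"  # ka + ma + la in Devanagari
    kannada_word = "\u0c95\u0cae\u0cb2"     # ka + ma + la in Kannada
    print(to_phones(devanagari_word))                             # final vowel kept
    print(to_phones(devanagari_word, final_schwa_deletion=True))  # final vowel dropped
    print(to_phones(kannada_word))                                # same phones, other script
```

In this sketch the same three-consonant word written in either script yields the same phone sequence, and only the rule flag changes between target languages. A synthesiser trained on the shared labels could then, in principle, be reused without retraining; the real system's rules are certainly richer than this single switch.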

Empowering Global Voices: A Data-Efficient, Phoneme-Tone Adaptive Approach to High-Fidelity Speech Synthesis

Sound

Makes computers speak any language, even rare ones.

10 Apr 2025 1

88%

Text to Speech System for Meitei Mayek Script

Computation and Language

Lets computers speak the Manipuri language.

9 Aug 2025 1

87%

End-to-End Speech Translation for Low-Resource Languages Using Weakly Labeled Data

Computation and Language

Translates speech for languages with little data.

19 Jun 2025 1


Country of Origin

🇮🇳 India

Page Count

5 pages
