From Silent Signals to Natural Language: A Dual-Stage Transformer-LLM Approach
By: Nithyashree Sivasubramaniam
Potential Business Impact:
Lets computers understand silent speech better.
Silent Speech Interfaces (SSIs) have gained attention for their ability to generate intelligible speech from non-acoustic signals. While significant progress has been made in advancing speech generation pipelines, limited work has addressed the recognition and downstream processing of synthesized speech, which often suffers from phonetic ambiguity and noise. To overcome these challenges, we propose an enhanced automatic speech recognition framework that combines a transformer-based acoustic model with a large language model (LLM) for post-processing. The transformer captures full utterance context, while the LLM enforces linguistic consistency. Experimental results show a 6-point absolute (16% relative) reduction in word error rate (WER) from a 36% baseline, demonstrating a substantial improvement in intelligibility for silent speech interfaces.
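To make the dual-stage idea concrete, here is a minimal sketch of such a pipeline: a transformer acoustic model transcribes the synthesized speech, and an LLM post-processes the hypothesis for linguistic consistency. The model names, prompt, and file path below are illustrative assumptions, not the configuration used in the paper.

```python
# Sketch of a dual-stage ASR + LLM post-processing pipeline.
# Stage 1: transformer acoustic model (full-utterance context).
# Stage 2: LLM used purely as a text post-processor.
from transformers import pipeline

# Hypothetical model choices; swap in whatever checkpoints are available.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
llm = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")


def transcribe_and_correct(audio_path: str) -> str:
    """Run ASR, then ask the LLM to repair phonetically ambiguous words."""
    hypothesis = asr(audio_path)["text"]

    prompt = (
        "The following transcript comes from noisy, synthesized silent speech. "
        "Correct likely misrecognized words while preserving the meaning. "
        "Return only the corrected sentence.\n"
        f"Transcript: {hypothesis}\nCorrected:"
    )
    out = llm(prompt, max_new_tokens=64, do_sample=False)[0]["generated_text"]
    # The text-generation pipeline returns the prompt plus the continuation,
    # so strip the prompt to keep only the corrected transcript.
    return out[len(prompt):].strip()


if __name__ == "__main__":
    # Hypothetical input file for illustration.
    print(transcribe_and_correct("example_utterance.wav"))
```

In practice the post-processing prompt, decoding settings, and choice of LLM would be tuned on held-out data; this sketch only shows how the two stages connect.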
Similar Papers
Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs
Computation and Language
New AI is better at translating spoken words.
SpeechLLM: Unified Speech and Language Model for Enhanced Multi-Task Understanding in Low Resource Settings
Computation and Language
Lets computers understand spoken words across many tasks.
MultiStream-LLM: Bridging Modalities for Robust Sign Language Translation
Computation and Language
Translates sign language better by combining multiple input streams.