Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses

Published: September 17, 2025 | arXiv ID: 2509.14430v1

By: Yufeng Yang, Yiteng Huang, Yong Xu and more

BigTech Affiliations: Meta

Potential Business Impact:

Reduces interference from nearby talkers so smart glasses recognize the wearer's voice commands more reliably.

Business Areas:
Speech Recognition, Data and Analytics, Software

With the growing adoption of wearable devices such as smart glasses for AI assistants, wearer speech recognition (WSR) is becoming increasingly critical to next-generation human-computer interfaces. In real environments, however, interference from side-talk speech remains a significant challenge for WSR and can propagate errors to downstream tasks such as natural language processing. In this work, we introduce a novel multi-channel differential automatic speech recognition (ASR) method for robust WSR on smart glasses. The proposed system takes differential inputs from complementary frontends, including a beamformer, microphone selection, and a lightweight side-talk detection model, to improve the robustness of WSR. Evaluations on both simulated and real datasets demonstrate that the proposed system outperforms the traditional approach, achieving up to an 18.0% relative reduction in word error rate.
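
To make the headline metric concrete, the following is a minimal sketch of how word error rate (WER) and a relative WER reduction are commonly computed. It is not the authors' evaluation code. The function names `word_error_rate` and `relative_wer_reduction` are hypothetical, and the example WER values (10.0% and 8.2%) are illustrative numbers chosen only so that the arithmetic yields an 18% relative reduction; they are not results from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn the first i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - system_wer) / baseline_wer


if __name__ == "__main__":
    # One deleted word out of four reference words.
    print(word_error_rate("turn on the lights", "turn on lights"))  # 0.25

    # Illustrative values only (assumed, not taken from the paper).
    print(f"{relative_wer_reduction(0.100, 0.082):.1%}")
```

Note that a relative reduction is measured against the baseline's own error rate, so an 18.0% relative reduction means the error rate falls to 82% of its baseline value, not that it drops by 18 percentage points.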

Country of Origin
🇺🇸 United States

Page Count
5 pages

Category
Electrical Engineering and Systems Science:
Audio and Speech Processing