Enhancing Automatic Speech Recognition Through Integrated Noise Detection Architecture
By: Karamvir Singh
Potential Business Impact:
Helps computers hear words better in noisy places.
This research presents an approach to improving automatic speech recognition (ASR) by integrating noise detection directly into the recognition architecture. Building on the wav2vec2 framework, the proposed method adds a dedicated noise identification module that operates concurrently with speech transcription. Experiments on publicly available speech and environmental audio datasets show substantial improvements in transcription quality and noise discrimination: the enhanced system achieves lower word error rate and character error rate, and higher noise detection accuracy, than conventional architectures. The results indicate that jointly optimizing the transcription and noise classification objectives yields more reliable speech recognition in challenging acoustic conditions.
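For readers unfamiliar with the transcription metrics named above, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are conventionally computed: the edit distance between reference and hypothesis, divided by the reference length. The abstract does not state which evaluation tooling or text normalization the author used, so the function names here (`edit_distance`, `wer`, `cer`) are illustrative assumptions and not the paper's code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length.

    Assumption: whitespace counts as a character; some toolkits strip it first.
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Lower values are better for both metrics, and either can exceed 1.0 when the hypothesis contains many insertions. For example, `wer("the cat sat on the mat", "the cat sat on mat")` counts one deleted word out of six reference words.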
Similar Papers
Real-Time Speech Enhancement via a Hybrid ViT: A Dual-Input Acoustic-Image Feature Fusion
Sound
Cleans up noisy sounds so you can hear speech better.
Visual-Aware Speech Recognition for Noisy Scenarios
Computation and Language
Helps computers hear speech in noisy places.
When De-noising Hurts: A Systematic Study of Speech Enhancement Effects on Modern Medical ASR Systems
Sound
Makes computers understand messy talking better.