SonicSieve: Bringing Directional Speech Extraction to Smartphones Using Acoustic Microstructures
By: Kuang Yuan, Yifeng Wang, Xiyuxing Zhang, and others
Potential Business Impact:
Focuses phone mic on one voice in noise.
Imagine placing your smartphone on a table in a noisy restaurant and clearly capturing the voices of friends seated around you, or recording a lecturer's voice with clarity in a reverberant auditorium. We introduce SonicSieve, the first intelligent directional speech extraction system for smartphones using a bio-inspired acoustic microstructure. Our passive design embeds directional cues onto incoming speech without any additional electronics. It attaches to the in-line mic of low-cost wired earphones that plug into smartphones. We present an end-to-end neural network that processes the raw audio mixtures in real time on mobile devices. Our results show that SonicSieve achieves a signal quality improvement of 5.0 dB when focusing on a 30° angular region. Additionally, the performance of our system with only two microphones exceeds that of conventional five-microphone arrays.
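To make the idea of two-microphone directional extraction concrete, here is a minimal, illustrative sketch (not SonicSieve's learned network, which operates end to end on raw audio): with two microphones, a source's direction shows up as an inter-channel phase difference at each time-frequency bin, and a simple mask can keep only the bins whose phase difference matches a chosen look direction within a beam width such as 30°. The microphone spacing, sample rate, and beam-width threshold below are assumed values for the example.

```python
import numpy as np

def directional_mask(stft_l, stft_r, target_angle_deg, mic_spacing=0.02,
                     fs=16000, n_fft=512, sound_speed=343.0,
                     beam_width_deg=30.0):
    """Toy two-mic directional mask: keep time-frequency bins whose
    inter-channel phase difference (IPD) matches the target direction.

    stft_l, stft_r: complex STFTs of shape (n_fft//2 + 1, n_frames).
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)            # Hz per bin
    # Expected inter-mic time delay for a far-field source at target angle
    tau = mic_spacing * np.sin(np.deg2rad(target_angle_deg)) / sound_speed
    expected_ipd = 2 * np.pi * freqs[:, None] * tau       # per-bin expected IPD
    observed_ipd = np.angle(stft_l * np.conj(stft_r))     # measured IPD
    # Wrap the mismatch to [-pi, pi] before thresholding
    mismatch = np.angle(np.exp(1j * (observed_ipd - expected_ipd)))
    # Tolerance grows with frequency and half the beam width
    tol = (2 * np.pi * freqs[:, None] * (mic_spacing / sound_speed)
           * np.sin(np.deg2rad(beam_width_deg / 2)))
    return (np.abs(mismatch) <= np.maximum(tol, 1e-3)).astype(float)
```

A phase mask like this degrades with reverberation and spatial aliasing, which is part of why SonicSieve instead imprints richer directional cues via its passive microstructure and learns the extraction with a neural network.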
Similar Papers
VibOmni: Towards Scalable Bone-conduction Speech Enhancement on Earables
Sound
Lets earbuds hear you clearly in loud places.
Spatial Audio Processing with Large Language Model on Wearable Devices
Sound
Listens to where sounds come from.
Single-Microphone-Based Sound Source Localization for Mobile Robots in Reverberant Environments
Robotics
Robot hears where sounds come from with one ear.