A Penny for Your Thoughts: Decoding Speech from Inexpensive Brain Signals
By: Quentin Auster, Kateryna Shapovalenko, Chuang Ma, and more
Potential Business Impact:
Decodes brain waves into speech.
We explore whether neural networks can decode brain activity into speech by mapping EEG recordings to audio representations. Using EEG data recorded as subjects listened to natural speech, we train a model with a contrastive CLIP loss to align EEG-derived embeddings with embeddings from a pre-trained transformer-based speech model. Building on the state-of-the-art EEG decoder from Meta, we introduce three architectural modifications: (i) subject-specific attention layers (+0.15% WER improvement), (ii) personalized spatial attention (+0.45%), and (iii) a dual-path RNN with attention (-1.87%). Two of the three modifications improved performance, highlighting the promise of personalized architectures for brain-to-speech decoding and applications in brain-computer interfaces.
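The contrastive CLIP objective described above can be illustrated with a minimal sketch: a symmetric InfoNCE loss that pulls each EEG-derived embedding toward the speech-model embedding of the audio the subject heard, and pushes it away from the other clips in the batch. This is a generic CLIP-style loss under assumed embedding shapes, not the authors' exact implementation; the function name, dimensions, and temperature value are illustrative.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(eeg_emb, audio_emb, temperature=0.07):
    """Symmetric CLIP-style contrastive loss for EEG/audio alignment.

    eeg_emb, audio_emb: (batch, dim) tensors where matched rows are
    positive pairs (the EEG window and the audio clip it was recorded to).
    """
    # L2-normalize so dot products become cosine similarities
    eeg = F.normalize(eeg_emb, dim=-1)
    audio = F.normalize(audio_emb, dim=-1)

    # (batch, batch) similarity matrix, scaled by temperature
    logits = eeg @ audio.t() / temperature

    # Diagonal entries are the correct (positive) pairings
    targets = torch.arange(len(eeg), device=eeg.device)

    # Cross-entropy in both directions: EEG->audio and audio->EEG
    loss = (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2
    return loss

# Example: random stand-in embeddings for a batch of 8 EEG/audio windows
loss = clip_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
```

At evaluation time, the same similarity matrix supports retrieval: each EEG segment is scored against a set of candidate speech embeddings, and decoding accuracy reflects how often the true clip ranks highest.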
Similar Papers
Neural Decoding of Overt Speech from ECoG Using Vision Transformers and Contrastive Representation Learning
Artificial Intelligence
Lets paralyzed people talk by reading brain signals.
CAT-Net: A Cross-Attention Tone Network for Cross-Subject EEG-EMG Fusion Tone Decoding
Sound
Lets people "talk" with their minds by reading brain and muscle signals.
Reconstructing Unseen Sentences from Speech-related Biosignals for Open-vocabulary Neural Communication
Human-Computer Interaction
Lets brains speak any new sentence.