From Minutes to Days: Scaling Intracranial Speech Decoding with Supervised Pretraining
By: Linnea Evanson, Mingfang Zhang, and more
Potential Business Impact:
Lets computers understand speech from brain signals.
Decoding speech from brain activity has typically relied on limited neural recordings collected during short and highly controlled experiments. Here, we introduce a framework to leverage week-long intracranial and audio recordings from patients undergoing clinical monitoring, effectively increasing the training dataset size by over two orders of magnitude. With this pretraining, our contrastive learning model substantially outperforms models trained solely on classic experimental data, with gains that scale log-linearly with dataset size. Analysis of the learned representations reveals that, while brain activity represents speech features, its global structure largely drifts across days, highlighting the need for models that explicitly account for cross-day variability. Overall, our approach opens a scalable path toward decoding and modeling brain representations in both real-life and controlled task settings.
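The paper does not spell out its training objective here, but a contrastive model that aligns paired brain and audio recordings is typically trained with an InfoNCE-style loss: matched brain/audio segment pairs are pulled together in a shared embedding space while mismatched pairs within the batch serve as negatives. The sketch below is a minimal, hypothetical NumPy illustration of that generic objective, not the authors' implementation; the function name, temperature value, and embedding shapes are all assumptions.

```python
import numpy as np

def info_nce_loss(brain_emb, audio_emb, temperature=0.1):
    """Illustrative InfoNCE loss for paired brain/audio embeddings.

    brain_emb, audio_emb: (batch, dim) arrays where row i of each
    array is a matched brain/audio segment pair. This is a generic
    sketch, not the paper's actual model.
    """
    # L2-normalize rows so dot products become cosine similarities.
    b = brain_emb / np.linalg.norm(brain_emb, axis=1, keepdims=True)
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)

    # (batch, batch) similarity matrix; the diagonal holds true pairs.
    logits = (b @ a.T) / temperature

    # Log-softmax over candidate audio segments (numerically stable).
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy with the diagonal (true pair) as the target.
    return -np.mean(np.diag(log_probs))

# Toy check: identical (perfectly aligned) pairs score a lower loss
# than randomly mismatched pairs.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
loss_matched = info_nce_loss(emb, emb)
loss_mismatched = info_nce_loss(emb, rng.normal(size=(8, 16)))
```

With week-long recordings, the abstract's reported log-linear gains would correspond to sampling far more such (brain, audio) pairs for pretraining than a short controlled experiment can provide.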
Similar Papers
Neural Decoding of Overt Speech from ECoG Using Vision Transformers and Contrastive Representation Learning
Artificial Intelligence
Lets paralyzed people talk by reading brain signals.
BaRISTA: Brain Scale Informed Spatiotemporal Representation of Human Intracranial Neural Activity
Machine Learning (CS)
Helps computers understand brain signals better.
A Penny for Your Thoughts: Decoding Speech from Inexpensive Brain Signals
Sound
Reads thoughts to make speech from brain waves.