The Brain's Bitter Lesson: Scaling Speech Decoding With Self-Supervised Learning

Dulhan Jayalath; Gilad Landau; Brendan Shillingford; Mark Woolrich; Oiwi Parker Jones

arXiv:2406.04328 · cs.LG · June 3, 2025

TL;DR

This paper introduces a neuroscience-informed self-supervised learning approach that scales speech decoding from brain activity across heterogeneous datasets and subjects, improving on state-of-the-art models by 15-27% and generalising across participants, datasets, and tasks.

Contribution

The paper develops self-supervised objectives and an accompanying architecture for learning from heterogeneous brain recordings, so that speech decoding models can be trained at scale and generalise beyond a single subject or dataset.

Findings

1. Improves on state-of-the-art models by 15-27%.
2. Generalises across participants, datasets, and tasks.
3. Matches surgical decoding performance with non-invasive data.

Abstract

The past few years have seen remarkable progress in the decoding of speech from brain activity, primarily driven by large single-subject datasets. However, due to individual variation, such as anatomy, and differences in task design and scanning hardware, leveraging data across subjects and datasets remains challenging. In turn, the field has not benefited from the growing number of open neural data repositories to exploit large-scale deep learning. To address this, we develop neuroscience-informed self-supervised objectives, together with an architecture, for learning from heterogeneous brain recordings. Scaling to nearly 400 hours of MEG data and 900 subjects, our approach shows generalisation across participants, datasets, tasks, and even to novel subjects. It achieves improvements of 15-27% over state-of-the-art models and matches surgical decoding performance with non-invasive…

Videos

The Brain's Bitter Lesson: Scaling Speech Decoding With Self-Supervised Learning (SlidesLive)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems

Methods: Sparse Evolutionary Training