MEG-XL: Data-Efficient Brain-to-Text via Long-Context Pre-Training
Dulhan Jayalath, Oiwi Parker Jones

TL;DR
MEG-XL is a brain-to-text model pre-trained on 2.5-minute windows of MEG context. After fine-tuning for word decoding, it matches supervised performance with a fraction of the data and outperforms brain foundation models.
Contribution
The paper presents MEG-XL, a model pre-trained with 2.5 minutes of MEG context per sample (5-300x longer than prior work, equivalent to 191k tokens). The longer context yields representations that transfer to word decoding with far less fine-tuning data.
Findings
When fine-tuned for word decoding, MEG-XL matches supervised performance with a fraction of the data (e.g. 1 hour versus 50 hours).
Pre-training with longer neural context improves transfer to word decoding.
Long-context pre-training exploits extended neural information often discarded by other methods.
Abstract
Clinical brain-to-text interfaces are designed for paralysed patients who cannot provide extensive training recordings. Pre-training improves data-efficient generalisation by learning statistical priors across subjects, but these priors critically depend on context. While natural speech might unfold gradually over minutes, most methods pre-train with only a few seconds of context. Thus, we propose MEG-XL, a model pre-trained with 2.5 minutes of MEG context per sample, 5-300x longer than prior work, and equivalent to 191k tokens, capturing extended neural context. Fine-tuning on the task of word decoding from brain data, MEG-XL matches supervised performance with a fraction of the data (e.g. 1hr vs 50hrs) and outperforms brain foundation models. We find that models pre-trained with longer contexts learn representations that transfer better to word decoding. Our results indicate that…
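To put the abstract's figures in perspective, the short sketch below works through the arithmetic they imply. It uses only the numbers quoted above (2.5 minutes, 191k tokens, 5-300x, 1 hour versus 50 hours). The function names are illustrative, and the per-second token rate is a simple average: the abstract does not say how tokens are distributed across sensors or time, so treat it as a rough estimate rather than a description of the model's tokeniser.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the abstract.
# All constants are copied from the abstract; everything derived is an estimate.

CONTEXT_SECONDS = 2.5 * 60      # 2.5 minutes of MEG context per sample
TOKENS_PER_SAMPLE = 191_000     # "191k tokens" (rounded in the abstract)
LENGTH_RATIOS = (5, 300)        # "5-300x longer than prior work"
FINE_TUNE_HOURS = 1             # data used when fine-tuning MEG-XL (example)
SUPERVISED_HOURS = 50           # data used by the supervised comparison (example)


def average_tokens_per_second(tokens: float, seconds: float) -> float:
    """Average token rate, assuming tokens are spread evenly over the window."""
    return tokens / seconds


def implied_prior_context(seconds: float, ratio: float) -> float:
    """Context length of prior work implied by a 'ratio times longer' claim."""
    return seconds / ratio


rate = average_tokens_per_second(TOKENS_PER_SAMPLE, CONTEXT_SECONDS)
prior_contexts = [implied_prior_context(CONTEXT_SECONDS, r) for r in LENGTH_RATIOS]
data_fraction = FINE_TUNE_HOURS / SUPERVISED_HOURS

print(f"Context window: {CONTEXT_SECONDS:.0f} s")
print(f"Average token rate: about {rate:.0f} tokens per second")
print(f"Implied prior-work context: {min(prior_contexts)} s to {max(prior_contexts)} s")
print(f"Fine-tuning data relative to supervised baseline: {data_fraction:.0%}")
```

The implied prior-work range of 0.5 to 30 seconds is consistent with the abstract's remark that most methods pre-train with only a few seconds of context.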
