Measuring Robustness of Speech Recognition from MEG Signals Under Distribution Shift
Sheng-You Chien, Bo-Yi Mao, Yi-Ning Chang, Po-Chih Kuo

TL;DR
This paper compares residual CNNs, an STFT-based CNN, and a CNN-Transformer hybrid, together with several preprocessing strategies, for robust phoneme decoding from MEG signals, and highlights the importance of normalization and of handling distribution shift.
Contribution
The paper shows that the choice of data normalization strategy strongly affects model generalization, and it introduces MEGConformer as a more robust architecture for phoneme classification.
Findings
Instance normalization greatly improves generalization.
Most models degrade under distribution shift without normalization.
MEGConformer maintains stable performance across splits.
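To make the first finding concrete, here is a minimal NumPy sketch of instance normalization, assuming each sample is a (channels, time) window and each channel is standardized over its own time axis. The function name, the epsilon value, and the array shapes are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def instance_normalize(x, eps=1e-6):
    """Standardize every channel of every sample over its own time axis.

    x has shape (batch, channels, time). No statistics are shared across
    samples, so a per-recording change in offset or gain is removed.
    """
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)


rng = np.random.default_rng(0)

# Hypothetical batch: 4 windows, 306 sensors, 125 time points.
train_like = rng.normal(0.0, 1.0, size=(4, 306, 125))

# Simulate a simple distribution shift: different gain and offset.
shifted = 3.0 * train_like + 5.0

a = instance_normalize(train_like)
b = instance_normalize(shifted)

# After normalization the two versions are numerically almost identical.
max_gap = float(np.abs(a - b).max())
print(f"max difference after normalization: {max_gap:.2e}")
```

The sketch covers only an affine shift per channel. Real session-to-session differences in MEG are more complex, so this illustrates the mechanism rather than reproducing the reported gains.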
Abstract
This study investigates robust speech-related decoding from non-invasive MEG signals using the LibriBrain phoneme-classification benchmark from the 2025 PNPL competition. We compare residual convolutional neural networks (CNNs), an STFT-based CNN, and a CNN-Transformer hybrid, while also examining the effects of group averaging, label balancing, repeated grouping, normalization strategies, and data augmentation. Across our in-house implementations, preprocessing and data-configuration choices matter more than additional architectural complexity, and instance normalization emerges as the most influential modification for generalization. The strongest of our own models, a CNN with group averaging, label balancing, repeated grouping, and instance normalization, achieves 60.95% F1-macro on the test split, compared with 39.53% for the plain CNN baseline. However, most of our models,…
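The abstract names group averaging without describing it. One plausible reading, sketched below as an assumption rather than as the authors' procedure, is that several trials sharing a phoneme label are averaged into a single, less noisy example. The function name, the group size, and the synthetic data are all hypothetical.

```python
import numpy as np


def group_average(x, y, group_size, rng):
    """Average random groups of trials that share a label.

    x has shape (n, channels, time); y has shape (n,) with integer labels.
    Trials left over after forming full groups are dropped.
    """
    out_x, out_y = [], []
    for label in np.unique(y):
        # Shuffle the indices of this label so groups are random.
        idx = rng.permutation(np.flatnonzero(y == label))
        n_groups = len(idx) // group_size
        for g in range(n_groups):
            members = idx[g * group_size:(g + 1) * group_size]
            out_x.append(x[members].mean(axis=0))
            out_y.append(label)
    return np.stack(out_x), np.array(out_y)


rng = np.random.default_rng(0)
n, channels, time = 200, 8, 16

# Synthetic two-class data: a fixed template per class plus strong noise.
y = rng.integers(0, 2, size=n)
template = np.stack([np.ones((channels, time)), -np.ones((channels, time))])
x = template[y] + rng.normal(0.0, 4.0, size=(n, channels, time))

gx, gy = group_average(x, y, group_size=10, rng=rng)

# Averaging k independent trials shrinks noise by roughly sqrt(k).
noise_before = float((x - template[y]).std())
noise_after = float((gx - template[gy]).std())
print(f"noise std before: {noise_before:.2f}, after: {noise_after:.2f}")
```

Calling the function again with a different random state yields different groups from the same trials. Whether that corresponds to the "repeated grouping" mentioned in the abstract is not stated in the text above.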
