Measuring Robustness of Speech Recognition from MEG Signals Under Distribution Shift

Sheng-You Chien; Bo-Yi Mao; Yi-Ning Chang; Po-Chih Kuo

arXiv:2604.04129 · cs.SD · April 7, 2026

TL;DR

This paper evaluates various neural network models and preprocessing strategies for robust phoneme decoding from MEG signals, highlighting the importance of normalization and distribution shift handling.

Contribution

It demonstrates that data normalization strategies significantly impact model generalization, and introduces MEGConformer as a more robust architecture for phoneme classification.

Findings

01 Instance normalization greatly improves generalization.

02 Most models degrade under distribution shift without normalization.

03 MEGConformer maintains stable performance across splits.
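
To make the first finding concrete, here is a minimal sketch of instance normalization applied to a single MEG window. The page does not give the paper's exact formulation, so this assumes the common definition: each channel of each window is z-scored over time using only that window's own mean and standard deviation. The function name `instance_normalize`, the `eps` value, and the toy session data are illustrative assumptions, not taken from the paper.

```python
import math


def instance_normalize(window, eps=1e-8):
    """Z-score each channel of one window with that window's own statistics.

    window: list of channels, each a list of float samples over time.
    Assumed definition; the paper's exact variant is not stated on this page.
    """
    normalized = []
    for channel in window:
        mean = sum(channel) / len(channel)
        var = sum((x - mean) ** 2 for x in channel) / len(channel)
        std = math.sqrt(var + eps)  # eps guards against a flat channel
        normalized.append([(x - mean) / std for x in channel])
    return normalized


# Two hypothetical single-channel windows with the same waveform shape but a
# different offset and scale, standing in for recordings from two sessions.
session_a = [[1.0, 2.0, 3.0, 4.0]]
session_b = [[110.0, 120.0, 130.0, 140.0]]

norm_a = instance_normalize(session_a)
norm_b = instance_normalize(session_b)
```

After normalization the two windows are numerically almost identical, which is one plausible reason a per-window statistic helps when the offset or scale of the signal shifts between training and test recordings.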

Abstract

This study investigates robust speech-related decoding from non-invasive MEG signals using the LibriBrain phoneme-classification benchmark from the 2025 PNPL competition. We compare residual convolutional neural networks (CNNs), an STFT-based CNN, and a CNN-Transformer hybrid, while also examining the effects of group averaging, label balancing, repeated grouping, normalization strategies, and data augmentation. Across our in-house implementations, preprocessing and data-configuration choices matter more than additional architectural complexity; among these choices, instance normalization emerges as the most influential modification for generalization. The strongest of our own models, a CNN with group averaging, label balancing, repeated grouping, and instance normalization, achieves 60.95% F1-macro on the test split, compared with 39.53% for the plain CNN baseline. However, most of our models,…
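
The scores quoted in the abstract are F1-macro values. As a rough illustration of what that metric rewards, the sketch below computes macro-averaged F1 from scratch: F1 is computed per class and then averaged with equal weight, so rare classes count as much as frequent ones. The helper `f1_macro` and the two phoneme labels are hypothetical examples and are not the benchmark's evaluation code.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative imbalanced toy set: a classifier that always predicts the
# frequent label looks good on accuracy but poor on macro F1.
y_true = ["AH"] * 8 + ["ZH"] * 2
y_pred = ["AH"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro = f1_macro(y_true, y_pred)
```

In this toy case accuracy is 0.8 while macro F1 is about 0.44, because the rare class contributes an F1 of zero. This suggests why a step such as label balancing could matter for a macro-averaged score, although the abstract does not break down how much each preprocessing step contributes.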
