BrainWavLM: Fine-tuning Speech Representations with Brain Responses to Language
Nishitha Vattikonda, Aditya R. Vaidya, Richard J. Antonello, Alexander G. Huth

TL;DR
This paper introduces BrainWavLM, a fine-tuned speech encoding model that leverages brain responses to improve the prediction of human brain activity during speech perception, demonstrating enhanced robustness and semantic representation.
Contribution
The study presents BrainWavLM, a speech encoding model obtained by fine-tuning WavLM end-to-end with LoRA on a brain encoding objective. Training directly on brain response data improves encoding performance and strengthens the model's semantic representations.
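To make the LoRA idea concrete, here is a minimal NumPy sketch of a single low-rank-adapted linear layer. It is an illustration only, not the authors' implementation: the class name `LoRALinear`, the rank, the scaling factor `alpha`, and the layer sizes are all assumptions chosen for the example. The pretrained weight stays frozen and only the two small matrices `A` and `B` would be trained.

```python
import numpy as np


class LoRALinear:
    """Linear layer with a frozen weight plus a low-rank update (hypothetical sketch)."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                   # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))    # trainable down-projection
        self.B = np.zeros((out_dim, rank))                     # trainable up-projection, zero init
        self.scale = alpha / rank

    def __call__(self, x):
        # x has shape (batch, in_dim); output has shape (batch, out_dim).
        frozen = x @ self.weight.T
        update = self.scale * (x @ self.A.T) @ self.B.T
        return frozen + update

    def num_trainable(self):
        # Only A and B are counted; the frozen weight is excluded.
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
W = rng.normal(size=(16, 32))      # stand-in for one pretrained weight matrix
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(3, 32))
out = layer(x)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and the number of trainable parameters (`rank * (in_dim + out_dim)`) is much smaller than the size of the full weight matrix.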
Findings
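Fine-tuning across all of cortex improves average encoding performance, with greater stability than fine-tuning without LoRA.
Selective fine-tuning on auditory cortex improves performance there while largely retaining gains in the rest of cortex.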
Models generalize across subjects, capturing robust brain-like speech representations.
Abstract
Speech encoding models use auditory representations to predict how the human brain responds to spoken language stimuli. Most performant encoding models linearly map the hidden states of artificial neural networks to brain data, but this linear restriction may limit their effectiveness. In this work, we use low-rank adaptation (LoRA) to fine-tune a WavLM-based encoding model end-to-end on a brain encoding objective, producing a model we name BrainWavLM. We show that fine-tuning across all of cortex improves average encoding performance with greater stability than without LoRA. This improvement comes at the expense of low-level regions like auditory cortex (AC), but selectively fine-tuning on these areas improves performance in AC, while largely retaining gains made in the rest of cortex. Fine-tuned models generalized across subjects, indicating that they learned robust brain-like…
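The linear baseline that the abstract contrasts with can be sketched as a regularized regression from model hidden states to voxel responses. The snippet below is a minimal illustration on synthetic data, not the paper's pipeline: ridge regression, Pearson correlation as the score, the function names, and all array sizes are assumptions made for the example.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: maps features X (time x features) to responses Y (time x voxels)."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def voxelwise_correlation(Y_true, Y_pred):
    """Pearson correlation between true and predicted responses, computed per voxel (column)."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


# Synthetic stand-ins: 16-dimensional "hidden states" and 5 "voxels".
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 16))
X_test = rng.normal(size=(50, 16))
W_true = rng.normal(size=(16, 5))
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(200, 5))
Y_test = X_test @ W_true + 0.1 * rng.normal(size=(50, 5))

W_hat = fit_ridge(X_train, Y_train, alpha=1.0)
scores = voxelwise_correlation(Y_test, X_test @ W_hat)
```

In this setup the feature extractor is fixed and only the linear map is fit. End-to-end fine-tuning, as described in the abstract, instead lets the encoding objective also update the speech model itself.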
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Topic Modeling · Neurobiology of Language and Bilingualism
