Decoding Speech Envelopes from Electroencephalogram with a Contrastive Pearson Correlation Coefficient Loss
Yayun Liang, Yuanming Zhang, Fei Chen, Jing Lu, Zhibin Lin

TL;DR
This paper introduces a contrastive Pearson correlation coefficient loss for EEG-based speech envelope reconstruction, enhancing auditory attention decoding accuracy by explicitly maximizing the difference between attended and unattended PCCs.
Contribution
It proposes a novel contrastive PCC loss function that improves speech envelope decoding from EEG signals, outperforming traditional methods focused solely on attended PCC maximization.
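The idea of the contrastive objective can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names `pcc` and `contrastive_pcc_loss`, the `eps` stabilizer, and the unweighted difference of the two correlations are all assumptions, since the summary does not give the exact formula.

```python
import numpy as np

def pcc(x, y, eps=1e-8):
    """Pearson correlation along the last (time) axis.

    eps is an assumed numerical stabilizer, not taken from the paper.
    """
    x = x - x.mean(axis=-1, keepdims=True)
    y = y - y.mean(axis=-1, keepdims=True)
    num = (x * y).sum(axis=-1)
    den = np.sqrt((x ** 2).sum(axis=-1) * (y ** 2).sum(axis=-1)) + eps
    return num / den

def contrastive_pcc_loss(recon, attended, unattended):
    """Assumed form: minimize (unattended PCC - attended PCC).

    Minimizing this value maximizes the gap between the attended PCC
    and the unattended PCC, averaged over the batch.
    """
    gap = pcc(recon, attended) - pcc(recon, unattended)
    return float(np.mean(-gap))

def attended_only_loss(recon, attended):
    """Baseline described in the summary: maximize the attended PCC only."""
    return float(np.mean(-pcc(recon, attended)))
```

Under this assumed form the loss lies between -2 (perfect correlation with the attended envelope, perfect anti-correlation with the unattended one) and +2, whereas the attended-only baseline ignores the unattended envelope entirely.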
Findings
Improves envelope separability and AAD accuracy across datasets.
Reveals dataset- and architecture-dependent failure cases.
Enhances DNN-based speech envelope reconstruction performance.
Abstract
Recent advances in reconstructing speech envelopes from Electroencephalogram (EEG) signals have enabled continuous auditory attention decoding (AAD) in multi-speaker environments. Most Deep Neural Network (DNN)-based envelope reconstruction models are trained to maximize the Pearson correlation coefficient (PCC) between the attended envelope and the reconstructed envelope (attended PCC). Although the difference between the attended PCC and the unattended PCC plays an essential role in auditory attention decoding, existing methods often focus only on maximizing the attended PCC. We therefore propose a contrastive PCC loss which represents the difference between the attended PCC and the unattended PCC. The proposed approach is evaluated on three public EEG AAD datasets using four DNN architectures. Across many settings, the proposed objective improves envelope separability and AAD accuracy,…
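To make the role of the PCC difference concrete, the following sketch shows how a correlation-based AAD decision could be made for one decision window. It assumes the common two-speaker setup in which the speaker whose envelope correlates more strongly with the reconstruction is labeled as attended. The function name `decode_attention` and the returned margin are hypothetical and are not taken from the paper.

```python
import numpy as np

def decode_attention(recon, env_a, env_b):
    """Pick the attended speaker for one decision window.

    recon, env_a, env_b are 1-D arrays of equal length: the envelope
    reconstructed from EEG and the two candidate speech envelopes.
    Returns the chosen label and the PCC margin (r_a - r_b).
    """
    r_a = np.corrcoef(recon, env_a)[0, 1]
    r_b = np.corrcoef(recon, env_b)[0, 1]
    label = "a" if r_a > r_b else "b"
    return label, float(r_a - r_b)

# Synthetic usage example: the reconstruction is a noisy copy of speaker a.
rng = np.random.default_rng(0)
env_a = rng.standard_normal(640)
env_b = rng.standard_normal(640)
recon = env_a + 0.5 * rng.standard_normal(640)
label, margin = decode_attention(recon, env_a, env_b)
```

Because the decision depends only on the sign of the margin, a training objective that widens the margin targets the decision rule more directly than one that raises the attended PCC alone.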
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Speech and Audio Processing · Speech Recognition and Synthesis
