Independent Feature Enhanced Crossmodal Fusion for Match-Mismatch Classification of Speech Stimulus and EEG Response

Shitong Fan; Wenbo Wang; Feiyang Xiao; Shiheng Zhang; Qiaoxi Zhu; Jian Guan

arXiv:2410.15078 · eess.AS · October 22, 2024 · ISCSLP

Open Access

TL;DR

This paper introduces IFE-CF, a novel crossmodal fusion model that improves match-mismatch classification of speech stimuli and EEG responses by leveraging a crossmodal encoder, fusion modules, and causal masking.

Contribution

The proposed IFE-CF model combines crossmodal attention, feature fusion, and causal masking to improve auditory EEG decoding accuracy.
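As an illustration of the causal masking idea named above, here is a minimal sketch of a causal mask applied to attention scores. This summary does not say where or how IFE-CF applies its mask, so the lower-triangular form below is the generic attention convention and an assumption, not the authors' implementation. The names `causal_mask` and `masked_softmax` are hypothetical.

```python
import math

def causal_mask(n):
    """Return an n x n mask: True where step i may attend to step j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax that assigns zero weight to masked-out positions."""
    weights = []
    for row, allowed in zip(scores, mask):
        # Subtract the largest allowed score for numerical stability.
        peak = max(s for s, a in zip(row, allowed) if a)
        exps = [math.exp(s - peak) if a else 0.0 for s, a in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy 3-step score matrix (illustrative values only, not data from the paper).
scores = [[0.0, 1.0, 2.0],
          [1.0, 0.0, 1.0],
          [2.0, 1.0, 0.0]]
weights = masked_softmax(scores, causal_mask(3))
```

With this mask, each time step receives non-zero attention weight only from itself and earlier steps, so no future frame can influence the current one.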

Findings

01 Achieves higher classification accuracy than baseline methods.

02 Effectively models the relationship between speech and EEG signals.

03 Demonstrates robustness in auditory attention decoding.

Abstract

It is crucial for auditory attention decoding to classify matched and mismatched speech stimuli with corresponding EEG responses by exploring their relationship. However, existing methods often adopt two independent networks to encode the speech stimulus and the EEG response, which neglects the relationship between the signals from the two modalities. In this paper, we propose an independent feature enhanced crossmodal fusion model (IFE-CF) for match-mismatch classification, which leverages the fusion feature of the speech stimulus and the EEG response to achieve auditory EEG decoding. Specifically, our IFE-CF contains a crossmodal encoder to encode the speech stimulus and the EEG response with a two-branch structure connected via a crossmodal attention mechanism in the encoding process, a multi-channel fusion module to fuse features of the two modalities by aggregating the interaction feature…
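The two-branch encoder connected by crossmodal attention can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the IFE-CF architecture: it uses plain scaled dot-product attention, omits the learned query/key/value projections, and adds a residual connection that the abstract does not mention. The names `cross_attention` and `crossmodal_layer` and the toy feature values are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, context):
    """Scaled dot-product attention: each query frame attends over context frames.

    queries: list of T_q vectors of dimension d; context: list of T_c vectors
    of dimension d. Learned projections are omitted (identity) for brevity.
    """
    d = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in context]
        weights = softmax(scores)
        attended.append([sum(w * v[i] for w, v in zip(weights, context))
                         for i in range(d)])
    return attended

def crossmodal_layer(speech, eeg):
    """One two-branch step: each modality is updated with context from the other."""
    speech_ctx = cross_attention(speech, eeg)  # speech frames query the EEG
    eeg_ctx = cross_attention(eeg, speech)     # EEG frames query the speech
    # Residual addition is an assumption of this sketch.
    new_speech = [[a + b for a, b in zip(x, c)] for x, c in zip(speech, speech_ctx)]
    new_eeg = [[a + b for a, b in zip(x, c)] for x, c in zip(eeg, eeg_ctx)]
    return new_speech, new_eeg

# Toy features: 3 speech frames and 2 EEG frames, each of dimension 2.
speech = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
eeg = [[0.5, 0.5], [1.0, 0.0]]
new_speech, new_eeg = crossmodal_layer(speech, eeg)
```

Each branch keeps its own sequence length while absorbing information from the other modality, which is the property that two fully independent encoders lack.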

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Emotion and Mood Recognition

Methods: Softmax · Attention Is All You Need