DMF2Mel: A Dynamic Multiscale Fusion Network for EEG-Driven Mel Spectrogram Reconstruction
Cunhang Fan, Sheng Zhang, Jingjing Zhang, Enrui Liu, Xinhui Li, Gangming Zhao, Zhao Lv

TL;DR
This paper introduces DMF2Mel, a novel neural network architecture that improves the reconstruction of continuous imagined speech mel spectrograms from EEG signals by effectively modeling long-range dependencies and suppressing noise.
Contribution
The paper presents a multiscale fusion network for EEG-based speech reconstruction built from four modules (DC-FAM, HAMS-Net, the SplineMap attention mechanism, and convMamba), and reports better reconstruction accuracy and noise suppression than existing methods.
Findings
Achieved a 48% improvement in Pearson correlation over the baseline for known subjects
Achieved a 35% improvement in Pearson correlation over the baseline for unknown subjects
Demonstrated effective long-range dependency modeling in EEG signals
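The findings above are stated as relative improvements in Pearson correlation between reconstructed and target mel spectrograms. The sketch below shows one way such a metric and its relative improvement could be computed. It is an illustration, not the authors' evaluation code: the function names, the choice to average the correlation over mel bands, and the toy values are all assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def mean_band_pearson(pred, target):
    """Average Pearson r over mel bands.

    pred and target are lists of bands, each band a time series.
    Averaging over bands is an assumption of this sketch.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("need the same non-zero number of bands")
    return sum(pearson(p, t) for p, t in zip(pred, target)) / len(pred)


def relative_improvement(new_score, baseline_score):
    """Relative improvement of new_score over baseline_score, in percent."""
    return (new_score - baseline_score) / baseline_score * 100.0


# Toy, hypothetical data: two mel bands, three time frames each.
toy_pred = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
toy_target = [[2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
toy_score = mean_band_pearson(toy_pred, toy_target)
```

Under this definition, a model whose score is 1.48 times the baseline score would be reported as a 48% improvement, regardless of how small the absolute correlations are.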
Abstract
Decoding speech from brain signals is a challenging research problem. Although existing technologies have made progress in reconstructing the mel spectrograms of auditory stimuli at the word or letter level, there remain core challenges in the precise reconstruction of minute-level continuous imagined speech: traditional models struggle to balance the efficiency of temporal dependency modeling and information retention in long-sequence decoding. To address this issue, this paper proposes the Dynamic Multiscale Fusion Network (DMF2Mel), which consists of four core components: the Dynamic Contrastive Feature Aggregation Module (DC-FAM), the Hierarchical Attention-Guided Multi-Scale Network (HAMS-Net), the SplineMap attention mechanism, and the bidirectional state space module (convMamba). Specifically, the DC-FAM separates speech-related "foreground features" from noisy "background…
Taxonomy
Topics: Neural dynamics and brain function · CCD and CMOS Imaging Sensors · Machine Learning in Materials Science
