Unveiling Deep Semantic Uncertainty Perception for Language-Anchored Multi-modal Vision-Brain Alignment
Zehui Feng, Chenqi Zhang, Mingru Wang, Minuo Wei, Shiwei Cheng, Cuntai Guan, Ting Han

TL;DR
This paper introduces Bratrix, an end-to-end framework that aligns visual stimuli, neural signals, and language representations in a shared latent space, with the aim of improving interpretability and robustness in neural-visual-linguistic tasks.
Contribution
The authors present Bratrix as the first end-to-end framework for language-anchored vision-brain alignment. It decouples visual stimuli into visual and linguistic semantic components, and adds uncertainty modeling and a two-stage training strategy to improve performance.
Findings
Reported to outperform state-of-the-art methods on EEG, MEG, and fMRI tasks.
Improves EEG retrieval accuracy by more than 14.3%.
Enhances neural-visual-linguistic alignment and interpretability.
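For readers unfamiliar with the retrieval metric mentioned above, the following is a minimal sketch of how top-k retrieval accuracy is commonly computed once brain and image embeddings live in a shared space. It is not the paper's evaluation protocol: the use of cosine similarity, the function names `cosine` and `top_k_accuracy`, and the toy vectors are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k_accuracy(queries, candidates, k=1):
    """Fraction of queries whose matching candidate (same index) ranks in the top k."""
    hits = 0
    for i, q in enumerate(queries):
        scores = [cosine(q, c) for c in candidates]
        # Candidate indices ordered from most to least similar to the query.
        ranked = sorted(range(len(candidates)), key=lambda j: scores[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(queries)

# Toy data (hypothetical): brain[i] is the neural embedding recorded for image[i].
brain = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.5, 0.0, 0.4]]
image = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

top1 = top_k_accuracy(brain, image, k=1)  # third query ranks the wrong image first
top2 = top_k_accuracy(brain, image, k=2)  # third query's match is second-ranked
```

In this toy setup two of the three queries retrieve their own image at rank 1, and all three do within rank 2, so top-1 accuracy is 2/3 and top-2 accuracy is 1.0.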
Abstract
Unveiling visual semantics from neural signals such as EEG, MEG, and fMRI remains a fundamental challenge due to subject variability and the entangled nature of visual features. Existing approaches primarily align neural activity directly with visual embeddings, but visual-only representations often fail to capture latent semantic dimensions, limiting interpretability and deep robustness. To address these limitations, we propose Bratrix, the first end-to-end framework to achieve multimodal Language-Anchored Vision-Brain alignment. Bratrix decouples visual stimuli into hierarchical visual and linguistic semantic components, and projects both visual and brain representations into a shared latent space, enabling the formation of aligned visual-language and brain-language embeddings. To emulate human-like perceptual reliability and handle noisy neural signals, Bratrix incorporates a novel…
Taxonomy
Topics: Multimodal Machine Learning Applications · Face Recognition and Perception · EEG and Brain-Computer Interfaces
