SynMind: Reducing Semantic Hallucination in fMRI-Based Image Reconstruction
Lan Yang, Minghan Yang, Ke Li, Honggang Zhang, Kaiyue Pang, Yi-Zhe Song

TL;DR
SynMind introduces a semantic-aware framework for fMRI-based image reconstruction, significantly reducing hallucinations by integrating explicit semantic descriptions with visual priors, leading to more accurate and human-aligned images.
Contribution
The paper proposes SynMind, an approach that combines sentence-level semantic parsing with grounded vision-language models (VLMs) to improve the semantic fidelity of fMRI-based image reconstructions.
Findings
Outperforms state-of-the-art methods on multiple metrics
Produces reconstructions more aligned with human perception
Engages broader, more relevant brain regions in neurovisualization
Abstract
Recent advances in fMRI-based image reconstruction have achieved remarkable photo-realistic fidelity. Yet, a persistent limitation remains: while reconstructed images often appear naturalistic and holistically similar to the target stimuli, they frequently suffer from severe semantic misalignment -- salient objects are often replaced or hallucinated despite high visual quality. In this work, we address this limitation by rethinking the role of explicit semantic interpretation in fMRI decoding. We argue that existing methods rely too heavily on entangled visual embeddings which prioritize low-level appearance cues -- such as texture and global gist -- over explicit semantic identity. To overcome this, we parse fMRI signals into rich, sentence-level semantic descriptions that mirror the hierarchical and compositional nature of human visual understanding. We achieve this by leveraging…
Taxonomy
Topics: Face Recognition and Perception · Generative Adversarial Networks and Image Synthesis · Aesthetic Perception and Analysis
