EmoMind: Decoding Affective Captions from Human Brain fMRI
Bilal A. Mohammed, Lin Gu, Ruogo Fang

TL;DR
EmoMind is an end-to-end system that decodes affective captions directly from fMRI signals, combining semantic scene descriptions with continuous emotion vectors for personalized affective language generation.
Contribution
It introduces the first pipeline to decode affective captions from brain activity, combining semantic and emotional decoding with a controllable rewriting process.
Findings
EmoMind outperforms a GPT-4 baseline on subject-specific affective caption metrics.
The system enables smooth interpolation between semantic fidelity and affective expressivity.
It demonstrates robustness to measurement apparatus variations through synthetic-brain substitution tests.
Abstract
Decoding visual experience from brain activity has advanced substantially, but current brain-to-text systems largely recover semantic content while discarding affect. Additionally, language models can generate emotional text when prompted with categorical labels, but such labels collapse rich inter-subject variability into coarse discrete bins. We present EmoMind, the first end-to-end pipeline for decoding affective captions directly from fMRI signals. EmoMind first retrieves a semantically grounded neutral scene description from brain-decoded visual features, then rewrites it using a continuous 34-dimensional emotion vector decoded from the same fMRI recording. To control the balance between content preservation and affective expression, we train the rewriter with classifier-free guidance against an identity-preserving null branch, enabling smooth interpolation between semantic…
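The guidance mechanism named in the abstract can be illustrated with a minimal sketch. It assumes the standard classifier-free guidance mix, `null + scale * (cond - null)`, applied to next-token logits; the abstract does not state the exact inference rule, so the function names, the toy logits, and the three-word vocabulary below are all hypothetical and not taken from the paper.

```python
import math

def guided_logits(cond, null, scale):
    """Standard classifier-free guidance mix (assumed form): null + scale * (cond - null)."""
    return [n + scale * (c - n) for c, n in zip(cond, null)]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy next-token logits over a hypothetical 3-word vocabulary.
# Index 0: neutral word, index 1: affective word, index 2: unrelated word.
null_logits = [2.0, 0.5, 0.1]  # stand-in for the identity-preserving null branch
cond_logits = [1.0, 2.5, 0.1]  # stand-in for the emotion-conditioned branch

for scale in (0.0, 0.5, 1.0, 2.0):
    probs = softmax(guided_logits(cond_logits, null_logits, scale))
    print(scale, [round(p, 3) for p in probs])
```

Under this assumed rule, a scale of 0 reproduces the null branch (content is preserved), a scale of 1 reproduces the conditioned branch, and values in between or above trade semantic fidelity for affective expressivity.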
