Exploring The Visual Feature Space for Multimodal Neural Decoding

Weihao Xia; Cengiz Oztireli

arXiv:2505.15755 · cs.CV · May 22, 2025

Open Access

TL;DR

This paper introduces a zero-shot multimodal brain decoding method that leverages pre-trained visual features within Multimodal Large Language Models to improve the granularity and accuracy of neural decoding of visual information.

Contribution

It analyzes different visual feature spaces and proposes a new benchmark for evaluating fine-grained neural decoding across multiple levels of detail.

Findings

01 Enhanced decoding precision with multimodal models

02 Effective zero-shot decoding of detailed visual descriptions

03 Introduction of the MG-BrainDub benchmark for evaluation

Abstract

The intricacy of brain signals drives research that leverages multimodal AI to align brain modalities with visual and textual data for explainable descriptions. However, most existing studies are limited to coarse interpretations, lacking essential details on object descriptions, locations, attributes, and their relationships. This leads to imprecise and ambiguous reconstructions when using such cues for visual decoding. To address this, we analyze different choices of vision feature spaces from pre-trained visual components within Multimodal Large Language Models (MLLMs) and introduce a zero-shot multimodal brain decoding method that interacts with these models to decode across multiple levels of granularity. To assess a model's ability to decode fine details from brain signals, we propose the Multi-Granularity Brain Detail Understanding Benchmark (MG-BrainDub). This benchmark…

Taxonomy

Topics: Image Retrieval and Classification Techniques · Digital Media Forensic Detection

Methods: ALIGN