MM-OPERA: Benchmarking Open-ended Association Reasoning for Large Vision-Language Models
Zimeng Huang, Jinxin Ke, Xiaoxuan Fan, Yufeng Yang, Yang Liu, Liu Zhonghan, Zedi Wang, Junteng Dai, Haoyi Jiang, Yuyu Zhou, Keze Wang, Ziliang Chen

TL;DR
This paper introduces MM-OPERA, a comprehensive benchmark for evaluating open-ended association reasoning in large vision-language models, highlighting current limitations and guiding future improvements.
Contribution
It presents a new benchmark with 11,497 instances for open-ended association tasks, aligned with human psychometric principles, and introduces tailored evaluation strategies for LVLMs.
Findings
Current LVLMs show limitations in associative reasoning.
The benchmark reveals gaps in divergent and convergent thinking abilities.
Evaluation strategies improve understanding of model reasoning processes.
Abstract
Large Vision-Language Models (LVLMs) have exhibited remarkable progress. However, deficiencies remain compared to human intelligence, such as hallucination and shallow pattern matching. In this work, we aim to evaluate a fundamental yet underexplored intelligence: association, a cornerstone of human cognition for creative thinking and knowledge integration. Current benchmarks, often limited to closed-ended tasks, fail to capture the complexity of open-ended association reasoning vital for real-world applications. To address this, we present MM-OPERA, a systematic benchmark with 11,497 instances across two open-ended tasks: Remote-Item Association (RIA) and In-Context Association (ICA), aligning association intelligence evaluation with human psychometric principles. It challenges LVLMs to resemble the spirit of divergent thinking and convergent associative reasoning through free-form…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsMultimodal Machine Learning Applications · Neurobiology of Language and Bilingualism · Face Recognition and Perception
