MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models
Huanqia Cai, Yijun Yang, Winston Hu

TL;DR
This paper introduces MM-IQ, a benchmark for evaluating human-like abstraction and reasoning in multimodal models, revealing current models' significant limitations and proposing a new baseline trained with reinforcement learning.
Contribution
The paper presents MM-IQ, a large-scale multimodal reasoning benchmark, and provides a baseline model trained with reinforcement learning to advance AI reasoning capabilities.
Findings
Current models perform only slightly better than chance on MM-IQ.
Existing models show significant gaps in human-like reasoning abilities.
A new reinforcement learning-based baseline achieves competitive performance despite its smaller model size.
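To make "slightly better than chance" concrete, the sketch below compares a model's accuracy with the expected accuracy of uniform random guessing on multiple-choice items. It is only an illustration: the `Item` schema, the per-item option counts, and the toy data are hypothetical and are not taken from MM-IQ or from the authors' evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One multiple-choice problem (hypothetical schema, not MM-IQ's format)."""
    num_options: int  # number of answer choices offered
    answer: int       # index of the correct choice
    prediction: int   # index chosen by the model


def accuracy(items):
    """Fraction of items the model answered correctly."""
    return sum(it.prediction == it.answer for it in items) / len(items)


def chance_accuracy(items):
    """Expected accuracy of guessing uniformly at random on each item."""
    return sum(1.0 / it.num_options for it in items) / len(items)


# Toy data for illustration only; these are not benchmark results.
items = [
    Item(num_options=4, answer=0, prediction=0),
    Item(num_options=4, answer=1, prediction=2),
    Item(num_options=8, answer=3, prediction=3),
    Item(num_options=8, answer=5, prediction=1),
]

acc = accuracy(items)
chance = chance_accuracy(items)
print(f"accuracy={acc:.3f}  chance={chance:.3f}  margin={acc - chance:+.3f}")
```

Averaging `1 / num_options` per item, rather than assuming one fixed number of choices, keeps the chance baseline correct when problems offer different numbers of options.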
Abstract
IQ testing has served as a foundational methodology for evaluating human cognitive capabilities, deliberately decoupling assessment from linguistic background, language proficiency, or domain-specific knowledge to isolate core competencies in abstraction and reasoning. Yet, artificial intelligence research currently lacks systematic benchmarks to quantify these critical cognitive capabilities in multimodal systems. To address this crucial gap, we propose MM-IQ, a comprehensive evaluation framework, which comprises a large-scale training set with 4,776 visual reasoning problems and 2,710 meticulously curated test items spanning 8 distinct reasoning paradigms. Through systematic evaluation of existing open-source and proprietary multimodal models, our benchmark reveals striking limitations: even state-of-the-art architectures achieve only marginally superior performance to random chance…
Taxonomy
Topics: Speech and dialogue systems · Language, Metaphor, and Cognition
