GENIUS: Generative Fluid Intelligence Evaluation Suite
Ruichuan An, Sihan Yang, Ziyu Guo, Wei Dai, Zijun Shen, Haodong Li, Renrui Zhang, Xinyu Wei, Guopeng Li, Wenshan Wu, Wentao Zhang

TL;DR
GENIUS is a benchmark suite for evaluating Generative Fluid Intelligence (GFI) in multimodal models: their ability to induce patterns, execute constraints, and adapt to new contexts. It reveals current limitations and proposes diagnostic strategies.
Contribution
This paper formalizes GFI as three primitives, creates the GENIUS benchmark for assessment, and provides diagnostic insights and interventions to improve models' dynamic reasoning capabilities.
Findings
Models show significant deficits in GFI tasks.
Performance issues are due to limited context understanding, not generative ability.
The proposed attention intervention improves model reasoning without additional training.
Abstract
Unified Multimodal Models (UMMs) have shown remarkable progress in visual generation. Yet existing benchmarks predominantly assess crystallized intelligence, which relies on recalling accumulated knowledge and learned schemas. This focus overlooks Generative Fluid Intelligence (GFI): the capacity to induce patterns, reason through constraints, and adapt to novel scenarios on the fly. To rigorously assess this capability, we introduce GENIUS (Generative Fluid Intelligence Evaluation Suite). We formalize GFI as a synthesis of three primitives. These include inducing patterns (e.g., inferring personalized visual preferences), executing constraints (e.g., visualizing abstract metaphors), and adapting to new contexts (e.g., simulating counter-intuitive…
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games
