Integration of cognitive tasks into artificial general intelligence test for large models
Youzhi Qu, Chen Wei, Penghui Du, Wenxin Che, Chi Zhang, Wanli Ouyang, Yatao Bian, Feiyang Xu, Bin Hu, Kai Du, Haiyan Wu, Jia Liu, Quanying Liu

TL;DR
This paper proposes a comprehensive, cognitive science-inspired testing framework for large models to evaluate their multidimensional intelligence, aiming to improve assessment accuracy and guide targeted enhancements.
Contribution
It introduces a novel AGI testing framework encompassing multiple intelligence facets, integrating human-like cognitive tests into an immersive virtual environment.
Findings
A battery of cognitive tests from human intelligence assessments is adapted for large models.
The framework calls for test complexity to increase as models advance.
Test results must be interpreted carefully to avoid false positives and false negatives.
Abstract
During the evolution of large models, performance evaluation is necessary to assess their capabilities and ensure safety before practical application. However, current model evaluations mainly rely on specific tasks and datasets, lacking a unified framework for assessing the multidimensional intelligence of large models. In this perspective, we advocate for a comprehensive framework of cognitive science-inspired artificial general intelligence (AGI) tests, aimed at fulfilling the testing needs of large models with enhanced capabilities. The cognitive science-inspired AGI tests encompass the full spectrum of intelligence facets, including crystallized intelligence, fluid intelligence, social intelligence, and embodied intelligence. To assess the multidimensional intelligence of large models, the AGI tests consist of a battery of well-designed cognitive tests adopted from human…
Taxonomy
Topics: AI-based Problem Solving and Planning
