Evaluating Cognitive Age Alignment in Interactive AI Agents

Yifan Shen; Jiawen Zhang; Jian Xu; Junho Kim; Ismini Lourentzou; Xu Cao; Meihuan Huang

arXiv:2605.17894 · cs.AI · May 19, 2026

TL;DR

This paper introduces ChildAgentEval, a benchmark inspired by the Wechsler Intelligence Scale for Children (WISC), to assess how closely the reasoning of MLLM-based AI agents matches human cognitive abilities at different ages.

Contribution

The paper presents what the authors describe as the first psychometrically grounded interactive benchmark for evaluating cognitive age alignment in AI agents built on multimodal large language models (MLLMs).

Findings

01. Reveals gaps in AI agents' ability to simulate age-specific cognition

02. Provides a systematic comparison of AI reasoning with human developmental stages

03. Highlights areas where AI agents need improvement to match human cognitive levels

Abstract

While agentic AI and its core multimodal large language models (MLLMs) have demonstrated remarkable promise in language and visual reasoning across domains ranging from daily life to advanced scientific research, a profound gap remains between artificial and human intelligence. Despite the integration of powerful tools and advanced MLLMs, state-of-the-art AI agents frequently fail at foundational, seemingly simple tasks that a child can resolve with ease. Inspired by the Wechsler Intelligence Scale for Children (WISC), we introduce ChildAgentEval, the first psychometrically grounded interactive benchmark for evaluating cognitive age alignment in MLLM-based agents. ChildAgentEval systematically compares the reasoning performance of various MLLM-based interactive agents against age-specific human developmental stages, exposing where current agentic AI systems can and cannot simulate…

Code & Models

Repositories

pediamedai/ChildAgentEval (GitHub)