KidsArtBench: Multi-Dimensional Children's Art Evaluation with Attribute-Aware MLLMs
Mingrui Ye, Chanjin Zheng, Zengyi Yu, Chenyu Xiang, Zhixue Zhao, Zheng Yuan, Helen Yannakoudakis

TL;DR
This paper introduces KidsArtBench, a comprehensive dataset of children's artwork with multi-dimensional expert annotations, and proposes an attribute-aware evaluation method for multimodal models to better assess artistic qualities in educational contexts.
Contribution
The paper presents KidsArtBench, a new multi-dimensional benchmark for children's art evaluation, and develops an attribute-specific multi-LoRA approach with RAFT for improved assessment accuracy.
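As a rough illustration of the attribute-specific multi-LoRA idea, the sketch below keeps one frozen base weight matrix and adds a separate low-rank update for each evaluation attribute, selected by name at inference time. It is a minimal pure-Python sketch, not the authors' implementation: the class names `LoRAAdapter` and `MultiLoRALinear`, the toy weights, and the attribute "Imagination" are assumptions made for illustration (only "Realism" is named in the abstract), and the RAFT training step is not shown.

```python
from typing import Dict, List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LoRAAdapter:
    """Low-rank update delta(x) = (alpha / r) * B (A x). Hypothetical helper."""

    def __init__(self, A: Matrix, B: Matrix, alpha: float) -> None:
        self.A = A  # shape r x d_in
        self.B = B  # shape d_out x r
        self.scale = alpha / len(A)  # len(A) is the rank r

    def delta(self, x: Vector) -> Vector:
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]


class MultiLoRALinear:
    """Frozen base weights W plus one adapter per attribute (illustrative)."""

    def __init__(self, W: Matrix, adapters: Dict[str, LoRAAdapter]) -> None:
        self.W = W
        self.adapters = adapters

    def forward(self, x: Vector, attribute: str) -> Vector:
        base = matvec(self.W, x)
        update = self.adapters[attribute].delta(x)
        return [b + u for b, u in zip(base, update)]


# Toy example: a 2x2 identity base layer with two rank-1 adapters.
layer = MultiLoRALinear(
    W=[[1.0, 0.0], [0.0, 1.0]],
    adapters={
        "Realism": LoRAAdapter(A=[[1.0, 0.0]], B=[[0.0], [1.0]], alpha=1.0),
        # "Imagination" is a made-up attribute name for this sketch.
        "Imagination": LoRAAdapter(A=[[0.0, 1.0]], B=[[1.0], [0.0]], alpha=2.0),
    },
)

realism_out = layer.forward([2.0, 3.0], "Realism")
imagination_out = layer.forward([2.0, 3.0], "Imagination")
```

The same input yields different outputs depending on which adapter is selected, which is the property that lets each rubric dimension specialize while the base model stays shared.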
Findings
Correlation on the evaluation metrics increased from 0.468 to 0.653.
Enhanced assessment of perceptual and higher-order attributes.
Established a new testbed for educational AI evaluation.
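To make the correlation figures above more concrete, here is a minimal sketch of how agreement between model-predicted scores and expert scores could be measured for one rubric dimension. It assumes a Spearman rank correlation, which is a common choice for ordinal scores; this summary does not state which coefficient the paper reports. The function names and the toy scores are illustrative only.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores for one dimension; not data from the benchmark.
expert_scores = [1, 2, 2, 4]
model_scores = [1, 3, 2, 4]
rho = spearman(expert_scores, model_scores)
```

In a multi-dimensional setting, one would typically compute such a coefficient per dimension and then report the individual values or their average.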
Abstract
Multimodal Large Language Models (MLLMs) show remarkable progress across many visual-language tasks; however, their capacity to evaluate artistic expression remains limited. Aesthetic concepts are inherently abstract and open-ended, and multimodal artwork annotations are scarce. We introduce KidsArtBench, a new benchmark of over 1k children's artworks (ages 5-15) annotated by 12 expert educators across 9 rubric-aligned dimensions, together with expert comments for feedback. Unlike prior aesthetic datasets that provide single scalar scores on adult imagery, KidsArtBench targets children's artwork and pairs multi-dimensional annotations with comment supervision to enable both ordinal assessment and formative feedback. Building on this resource, we propose an attribute-specific multi-LoRA approach, where each attribute corresponds to a distinct evaluation dimension (e.g., Realism,…
Taxonomy
Topics: Multimodal Machine Learning Applications · Aesthetic Perception and Analysis · Generative Adversarial Networks and Image Synthesis
