Characterizing the visual representation of objects from the child's view

Jane Yang; Tarun Sepuri; Alvin Wei Ming Tan; Khai Loong Aw; Michael C. Frank; Bria Long

arXiv:2605.14990·cs.CV·May 15, 2026

TL;DR

This study analyzes young children's visual experiences through first-person videos, revealing skewed object category exposure, high variability in exemplars, and strong superordinate category groupings despite diverse views.

Contribution

The paper provides a detailed characterization of children's visual input and highlights the importance of superordinate structure for robust object category learning.

Findings

01

Children's visual experience is highly skewed towards a few categories.

02

Objects are viewed from unusual angles and in cluttered, occluded scenes.

03

Detected categories show stronger superordinate groupings than canonical images do.

Abstract

Children acquire object category representations from their everyday experiences in the first few years of life. What do the inputs to this learning process look like? We analyzed first-person videos of young children's visual experience at home from the BabyView dataset ($N = 31$ participants, 868 hours, ages 5–36 months), using a supervised object detection model to extract common object categories from more than 3 million frames. We found that children's object category exposure was highly skewed: a few categories (e.g., cups, chairs) dominated children's visual experiences while most categories appeared rarely, replicating previous findings from a more restricted set of contexts. Category exemplars were highly variable: children encountered objects from unusual angles, in highly cluttered scenes, and partially occluded views; many categories (especially animals) were most…
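The skew in category exposure described in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: it assumes detections are already available as one list of category labels per frame, and the function names `category_exposure` and `top_k_share`, along with the toy frames, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def category_exposure(frame_detections):
    """Count, per category, the number of frames containing at least one detection."""
    counts = Counter()
    for labels in frame_detections:
        # A set counts each category once per frame, even if detected several times.
        counts.update(set(labels))
    return counts


def top_k_share(counts, k):
    """Fraction of all category-frame occurrences covered by the k most frequent categories."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Toy, made-up detections (assumption: not data from the paper).
frames = [
    ["cup", "chair"],
    ["cup"],
    ["chair", "cup", "cup"],
    ["chair"],
    ["cup", "dog"],
    ["spoon"],
]

counts = category_exposure(frames)
share = top_k_share(counts, k=2)
print(counts.most_common())
print(f"Top-2 categories cover {share:.0%} of category-frame occurrences")
```

A top-k share close to 1 for small k would indicate that a handful of categories dominate exposure, which is the kind of skew the abstract reports; other summaries (for example, a rank-frequency plot) would serve the same purpose.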
