A Manually Annotated Image-Caption Dataset for Detecting Children in the Wild

Klim Kireev; Ana-Maria Crețu; Raphael Meier; Sarah Adel Bargal; Elissa Redmiles; Carmela Troncoso

arXiv:2506.10117 · cs.CV · June 13, 2025

TL;DR

This paper introduces ICCWD, an image-caption dataset for benchmarking child detection methods across diverse scenarios. Results on it show that automated minor detection remains challenging and that current tools have room to improve.

Contribution

We create and release ICCWD, to our knowledge the first dataset for benchmarking multi-modal child detection methods in varied contexts.

Findings

01

Child detection remains challenging: the best reported true positive rate is 75.3%.

02

The dataset enables evaluation of different detection approaches.

03

Commercial age estimation systems show limited accuracy in this task.
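
The true positive rate cited in the findings is the share of items that actually depict a child which a detector also flags. The sketch below shows that calculation on made-up labels. The function name `true_positive_rate` and the toy data are illustrative assumptions, not part of the ICCWD release or the authors' evaluation code.

```python
from typing import Sequence


def true_positive_rate(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of positive items (label 1) that the detector also flags (prediction 1)."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    # Count the items whose ground-truth label marks a child as present.
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        raise ValueError("TPR is undefined without positive examples")
    # Count the positives that the detector also marked as positive.
    true_positives = sum(
        1 for y, p in zip(labels, predictions) if y == 1 and p == 1
    )
    return true_positives / positives


# Hypothetical toy data, NOT taken from ICCWD: 1 = child present, 0 = absent.
toy_labels = [1, 1, 1, 1, 0, 0]
toy_predictions = [1, 1, 1, 0, 0, 1]
toy_tpr = true_positive_rate(toy_labels, toy_predictions)
```

On this toy data the detector finds three of the four positive items. Note that the false positive on the last item does not affect the true positive rate, which is why this metric is usually read alongside a false positive rate.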

Abstract

Platforms and the law regulate digital content depicting minors (defined as individuals under 18 years of age) differently from other types of content. Given the sheer amount of content that needs to be assessed, machine learning-based automation tools are commonly used to detect content depicting minors. To our knowledge, no dataset or benchmark currently exists for evaluating these detection methods in a multi-modal environment. To fill this gap, we release the Image-Caption Children in the Wild Dataset (ICCWD), an image-caption dataset aimed at benchmarking tools that detect depictions of minors. Our dataset is richer than previous child image datasets, containing images of children in a variety of contexts, including fictional depictions and partially visible bodies. ICCWD contains 10,000 image-caption pairs manually labeled to indicate the presence or absence of a child in the…

Code & Models

Repositories

spring-epfl/iccwd
Official

Datasets

amcretu/iccwd
dataset · 13 downloads

Taxonomy

Topics: Multimodal Machine Learning Applications · Face recognition and analysis · Advanced Image and Video Retrieval Techniques