Safe-Child-LLM: A Developmental Benchmark for Evaluating LLM Safety in Child-LLM Interactions

Junfeng Jiao; Saleh Afroogh; Kevin Chen; Abhejay Murali; David Atkinson; Amit Dhurandhar

arXiv:2506.13510·cs.CY·May 14, 2026

TL;DR

This paper introduces Safe-Child-LLM, a benchmark and dataset for evaluating the safety of large language models in interactions with children and adolescents, highlighting safety gaps and promoting ethical AI development.

Contribution

The paper presents a developmental benchmark and dataset designed specifically to assess LLM safety for minors, comprising adversarial prompts and safety evaluation metrics.

Findings

01. Leading LLMs show safety deficiencies in child-related scenarios.

02. The benchmark reveals vulnerabilities in current AI safety measures.

03. Public datasets and code are released for community use.

Abstract

As Large Language Models (LLMs) increasingly power applications used by children and adolescents, ensuring safe and age-appropriate interactions has become an urgent ethical imperative. Despite progress in AI safety, current evaluations predominantly focus on adults, neglecting the unique vulnerabilities of minors engaging with generative AI. We introduce Safe-Child-LLM, a comprehensive benchmark and dataset for systematically assessing LLM safety across two developmental stages: children (7-12) and adolescents (13-17). Our framework includes a novel multi-part dataset of 200 adversarial prompts, curated from red-teaming corpora (e.g., SG-Bench, HarmBench), with human-annotated labels for jailbreak success and a standardized 0-5 ethical refusal scale. Evaluating leading LLMs -- including ChatGPT, Claude, Gemini, LLaMA, DeepSeek, Grok, Vicuna, and Mistral -- we uncover critical safety…
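The abstract describes two kinds of human annotation per model response: a jailbreak-success label and a score on a 0-5 ethical refusal scale, reported for two age groups. The following is a minimal sketch of how such annotations might be aggregated, not the authors' evaluation code. The `Annotation` class, its field names, the group keys `"child"` and `"adolescent"`, and the `summarize` function are all hypothetical, and the sample records are invented for illustration. The abstract does not say which end of the 0-5 scale is safer, so the sketch only averages the scores without interpreting them.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# Age ranges as stated in the abstract; the dictionary keys are hypothetical labels.
AGE_GROUPS = {"child": (7, 12), "adolescent": (13, 17)}


@dataclass(frozen=True)
class Annotation:
    """One human-annotated model response (hypothetical schema)."""
    age_group: str           # "child" or "adolescent"
    jailbreak_success: bool  # True if the adversarial prompt succeeded
    refusal_score: int       # value on the 0-5 ethical refusal scale

    def __post_init__(self):
        if self.age_group not in AGE_GROUPS:
            raise ValueError(f"unknown age group: {self.age_group!r}")
        if not 0 <= self.refusal_score <= 5:
            raise ValueError("refusal_score must be between 0 and 5")


def summarize(annotations):
    """Return per-age-group count, jailbreak rate, and mean refusal score."""
    by_group = defaultdict(list)
    for ann in annotations:
        by_group[ann.age_group].append(ann)
    return {
        group: {
            "n": len(items),
            "jailbreak_rate": sum(a.jailbreak_success for a in items) / len(items),
            "mean_refusal_score": mean(a.refusal_score for a in items),
        }
        for group, items in by_group.items()
    }


# Invented sample records, for illustration only.
sample = [
    Annotation("child", False, 5),
    Annotation("child", True, 1),
    Annotation("adolescent", False, 4),
]
summary = summarize(sample)
```

Keeping the two measures separate mirrors the abstract, which lists the binary jailbreak label and the graded refusal scale as distinct annotations rather than a single combined score.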

Code & Models

Repositories

GitHub: The-Responsible-AI-Initiative/Safe_Child_LLM_Benchmark.git
