Can Large Language Models (LLMs) Describe Pictures Like Children? A Comparative Corpus Study

Hanna Woloszyn; Benjamin Gagl

arXiv:2508.13769 · cs.CL · August 20, 2025


TL;DR

This study tests whether large language models can generate child-like picture descriptions by comparing LLM output with German children's descriptions of the same picture stories. The LLM texts differ from the children's in length, lexical richness, word frequency, part-of-speech distribution, and semantic similarity.

Contribution

The paper provides a detailed psycholinguistic analysis of LLM-generated child-like language and highlights the limits of LLMs in replicating authentic child language patterns.

Findings

01

LLMs produce longer but less lexically rich texts.

02

LLMs rely more on high-frequency words and under-represent nouns.

03

Semantic similarity between LLM texts and children's descriptions is low.

Abstract

The role of large language models (LLMs) in education is increasing, yet little attention has been paid to whether LLM-generated text resembles child language. This study evaluates how LLMs replicate child-like language by comparing LLM-generated texts to a collection of German children's descriptions of picture stories. We generated two LLM-based corpora using the same picture stories and two prompt types: zero-shot and few-shot prompts specifying a general age from the children corpus. We conducted a comparative analysis across psycholinguistic text properties, including word frequency, lexical richness, sentence and word length, part-of-speech tags, and semantic similarity with word embeddings. The results show that LLM-generated texts are longer but less lexically rich, rely more on high-frequency words, and under-represent nouns. Semantic vector space analysis revealed low…
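To make the surface measures named in the abstract concrete, here is a minimal Python sketch of how text length, lexical richness, and word and sentence length could be computed for two texts. It is an illustration under stated assumptions, not the authors' pipeline: the type-token ratio stands in for whatever lexical richness measure the paper uses, the bag-of-words cosine is a simplified stand-in for the word-embedding similarity the abstract mentions, and the function names and the two German example texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware, so umlauts survive)."""
    return re.findall(r"\w+", text.lower())


def split_sentences(text):
    """Naive sentence split on ., ! and ? (an assumption; real corpora need a proper splitter)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def text_profile(text):
    """Return simple surface measures for one text."""
    tokens = tokenize(text)
    sentences = split_sentences(text)
    if not tokens or not sentences:
        return {"n_tokens": 0, "type_token_ratio": 0.0,
                "mean_word_length": 0.0, "mean_sentence_length": 0.0}
    return {
        "n_tokens": len(tokens),
        # Type-token ratio: distinct words divided by total words.
        "type_token_ratio": len(set(tokens)) / len(tokens),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
        "mean_sentence_length": len(tokens) / len(sentences),
    }


def cosine_similarity(text_a, text_b):
    """Cosine similarity of bag-of-words count vectors (a stand-in for embedding similarity)."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical example texts, invented for illustration only.
child_text = "Der Hund rennt. Der Junge lacht."
llm_text = ("Der kleine Hund rennt schnell und der kleine Junge lacht laut "
            "und der Hund bellt.")

if __name__ == "__main__":
    print("child:", text_profile(child_text))
    print("llm:  ", text_profile(llm_text))
    print("cosine:", cosine_similarity(child_text, llm_text))
```

In this toy pair the longer text has the lower type-token ratio, which is the direction the findings report. The type-token ratio also tends to fall as texts get longer, so a comparison across corpora of different lengths would need a length-controlled measure.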
