Dialogue Is Not Enough to Make a Communicative BabyLM (But Neither Is Developmentally Inspired Reinforcement Learning)

Francesca Padovani; Bastian Bunzeck; Manar Ali; Omar Momen; Arianna Bisazza; Hendrik Buschmeier; Sina Zarrieß

arXiv:2510.20358 · cs.CL · December 2, 2025

TL;DR

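This paper examines the limits of pre-training small language models solely on dialogue data. The resulting models excel at dialogue continuation prediction but underperform on most standard BabyLM benchmarks. Fine-tuning gives mixed results: PPO has mixed to adversarial effects, while DPO further improves performance on the authors' custom dialogue benchmark.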

Contribution

The paper pre-trains a small language model (llamalogue) exclusively on dialogue data and evaluates PPO and DPO fine-tuning strategies intended to make its generations more communicative, highlighting how difficult it is to reach general language competence from dialogue data alone.

Findings

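01 Models excel at dialogue continuation prediction in a minimal pair setting.

02 Models pre-trained on dialogue data alone underperform on most standard BabyLM benchmarks.

03 DPO fine-tuning further improves performance on the custom dialogue benchmark, while PPO fine-tuning has mixed to adversarial effects.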

Abstract

We investigate whether pre-training exclusively on dialogue data results in formally and functionally apt small language models. Based on this pre-trained llamalogue model, we employ a variety of fine-tuning strategies to enforce "more communicative" text generations by our models. Although our models underperform on most standard BabyLM benchmarks, they excel at dialogue continuation prediction in a minimal pair setting. While PPO fine-tuning has mixed to adversarial effects on our models, DPO fine-tuning further improves their performance on our custom dialogue benchmark.
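To make the "minimal pair setting" concrete, here is a minimal sketch of how such an evaluation is commonly scored: for each dialogue context, the model is counted as correct when it assigns a higher score to the appropriate continuation than to the inappropriate one. This is an illustration under stated assumptions, not the authors' benchmark code. The names `minimal_pair_accuracy`, `toy_score`, and `example_pairs` are hypothetical, and `toy_score` is a word-overlap stand-in for a real model's log-probability.

```python
from typing import Callable, Iterable, Tuple

# (dialogue context, appropriate continuation, inappropriate continuation)
MinimalPair = Tuple[str, str, str]


def minimal_pair_accuracy(
    pairs: Iterable[MinimalPair],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of pairs where the appropriate continuation scores higher."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one minimal pair")
    correct = sum(
        score(context, good) > score(context, bad)
        for context, good, bad in pairs
    )
    return correct / len(pairs)


def toy_score(context: str, continuation: str) -> float:
    # Hypothetical stand-in for a language model's log-probability of the
    # continuation given the context: counts continuation words that also
    # occur in the context.
    context_words = set(context.lower().split())
    return float(sum(w in context_words for w in continuation.lower().split()))


# Invented example items, for illustration only.
example_pairs = [
    ("do you want the red ball", "yes the red ball", "the sky is green"),
    ("where is your shoe", "my shoe is here", "i like apples"),
]

accuracy = minimal_pair_accuracy(example_pairs, toy_score)
```

In practice, `score` would be replaced by a function that sums the token log-probabilities a language model assigns to the continuation given the context; the accuracy computation itself stays the same.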

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications