Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models
Lennart Stöpler, Rufat Asadli, Mitja Nikolaus, Ryan Cotterell, Alex Warstadt

TL;DR
This paper introduces a child-inspired interactive training method for language models using communicative success as a reward, aiming to enhance language learning through single-turn dialogues.
Contribution
It operationalizes communicative success in an abstract language-only setting and demonstrates its potential as an indirect grammaticality signal for reinforcement learning.
Findings
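The communicative-success reward correlates with grammaticality, so it provides an indirect grammaticality signal.
Cognitively plausible constraints on the communication channel lead to interpretable changes in speaker behavior.
The training does not yet yield improvements on linguistic evaluations.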
Abstract
We propose a method for training language models in an interactive setting inspired by child language acquisition. In our setting, a speaker attempts to communicate some information to a listener in a single-turn dialogue and receives a reward if communicative success is achieved. Unlike earlier related work using image–caption data for interactive reference games, we operationalize communicative success in a more abstract language-only question–answering setting. First, we present a feasibility study demonstrating that our reward provides an indirect signal about grammaticality. Second, we conduct experiments using reinforcement learning to fine-tune language models. We observe that cognitively plausible constraints on the communication channel lead to interpretable changes in speaker behavior. However, we do not yet see improvements on linguistic evaluations from our training…
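To make the reward idea concrete, here is a minimal sketch of a communicative-success reward in a language-only question-answering setup. It is not the authors' implementation. The rule-based `toy_listener`, the sentence pattern it accepts, and the binary 0/1 reward are all assumptions made for illustration; in the paper both speaker and listener are language models. The sketch only shows why such a reward can carry an indirect grammaticality signal: a listener that fails to parse a scrambled message cannot answer the question, so the speaker gets no reward.

```python
import re
from typing import Callable, Optional

# Hypothetical toy listener (an assumption for this sketch, not from the paper):
# it only understands messages of the form "the <entity> is on the <place>".
_PATTERN = re.compile(r"^the (\w+) is on the (\w+)$")


def toy_listener(message: str, question: str) -> Optional[str]:
    """Answer 'where is the <entity>?' using only the speaker's message."""
    match = _PATTERN.match(message.strip().lower())
    if match is None:
        return None  # the message could not be parsed
    entity, place = match.groups()
    if question.strip().lower() == f"where is the {entity}?":
        return place
    return None  # the message does not answer this question


def communicative_reward(
    message: str,
    question: str,
    gold_answer: str,
    listener: Callable[[str, str], Optional[str]] = toy_listener,
) -> float:
    """Return 1.0 if the listener recovers the gold answer, else 0.0."""
    answer = listener(message, question)
    if answer is None:
        return 0.0
    return 1.0 if answer == gold_answer.strip().lower() else 0.0


# A well-formed message lets the listener answer; a scrambled one does not.
well_formed = communicative_reward("The cat is on the mat", "Where is the cat?", "mat")
scrambled = communicative_reward("mat cat the on is", "Where is the cat?", "mat")
```

In a reinforcement-learning loop, a scalar like this would be the return for the speaker's single-turn utterance. Which policy-gradient method is used, and how the listener is modelled, is left open here.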
Taxonomy
Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Topic Modeling
