Dialogue Learning With Human-In-The-Loop
Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston

TL;DR
This paper explores how conversational agents can learn and improve through human interaction using reinforcement learning, with a simulated environment and real-world validation on Mechanical Turk.
Contribution
It introduces a reinforcement learning framework for dialogue learning with human feedback and validates it through both simulation and real human interactions.
Findings
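In a reinforcement learning setting, a bot improves its question-answering ability from the feedback a teacher gives on its responses.
A simulator tests various aspects of this kind of learning in a synthetic environment.
Experiments with real human teachers on Mechanical Turk validate the approach.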
Abstract
An important aspect of developing conversational agents is to give a bot the ability to improve through communicating with humans and to learn from the mistakes that it makes. Most research has focused on learning from fixed training sets of labeled data rather than interacting with a dialogue partner in an online fashion. In this paper we explore this direction in a reinforcement learning setting where the bot improves its question-answering ability from feedback a teacher gives following its generated responses. We build a simulator that tests various aspects of such learning in a synthetic environment, and introduce models that work in this regime. Finally, real experiments with Mechanical Turk validate the approach.
Taxonomy
Topics: Speech and dialogue systems · AI in Service Interactions · Topic Modeling
