# Learning from Dialogue after Deployment: Feed Yourself, Chatbot!

**Authors:** Braden Hancock, Antoine Bordes, Pierre-Emmanuel Mazaré, Jason Weston

arXiv: 1901.05415 · 2019-06-14

## TL;DR

This paper introduces a self-feeding chatbot that extracts new training examples and user feedback from the conversations it takes part in after deployment, significantly improving performance on PersonaChat regardless of the amount of traditional supervision.

## Contribution

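The paper proposes the self-feeding chatbot, a dialogue agent that estimates user satisfaction during a conversation and uses that estimate to collect its own training data after deployment: it imitates user responses when the conversation appears to be going well and asks for feedback when it believes it has made a mistake.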

## Key findings

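- Learning from deployment dialogue significantly improves performance on the PersonaChat chit-chat dataset (over 131k training examples).
- User responses from conversations that appear to be going well serve as new training examples to imitate.
- Learning to predict the feedback requested after suspected mistakes improves dialogue ability further, and the gains hold regardless of the amount of traditional supervision.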

## Abstract

The majority of conversations a dialogue agent sees over its lifetime occur after it has already been trained and deployed, leaving a vast store of potential training signal untapped. In this work, we propose the self-feeding chatbot, a dialogue agent with the ability to extract new training examples from the conversations it participates in. As our agent engages in conversation, it also estimates user satisfaction in its responses. When the conversation appears to be going well, the user's responses become new training examples to imitate. When the agent believes it has made a mistake, it asks for feedback; learning to predict the feedback that will be given improves the chatbot's dialogue abilities further. On the PersonaChat chit-chat dataset with over 131k training examples, we find that learning from dialogue with a self-feeding chatbot significantly improves performance, regardless of the amount of traditional supervision.
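The routing logic described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `SelfFeedingBuffer`, its method names, the `[0, 1]` satisfaction scale, and the two cutoff values are all hypothetical choices made for illustration, and the satisfaction score is assumed to come from some separate estimator that is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SelfFeedingBuffer:
    """Collects training examples mined from deployment conversations.

    All names and cutoff values here are hypothetical; the abstract does
    not specify how satisfaction is scored or thresholded.
    """
    satisfied_above: float = 0.8   # assumed cutoff: conversation is going well
    mistake_below: float = 0.2     # assumed cutoff: agent likely made a mistake
    imitation_examples: List[Tuple[str, str]] = field(default_factory=list)
    feedback_examples: List[Tuple[str, str]] = field(default_factory=list)

    def observe(self, context: str, user_response: str, satisfaction: float) -> str:
        """Route one turn based on the estimated user satisfaction."""
        if satisfaction >= self.satisfied_above:
            # Going well: the user's response becomes an example to imitate.
            self.imitation_examples.append((context, user_response))
            return "imitate"
        if satisfaction <= self.mistake_below:
            # Suspected mistake: the agent should ask the user for feedback.
            return "ask_feedback"
        # Uncertain: extract nothing and keep talking.
        return "continue"

    def add_feedback(self, context: str, feedback: str) -> None:
        """Store the feedback the user gave after being asked."""
        self.feedback_examples.append((context, feedback))


buffer = SelfFeedingBuffer()
first = buffer.observe("Do you have any pets?", "Yes, two cats!", satisfaction=0.93)
second = buffer.observe("I love hiking.", "What? That makes no sense.", satisfaction=0.05)
if second == "ask_feedback":
    buffer.add_feedback("I love hiking.", "You could have asked where I like to hike.")
third = buffer.observe("Nice weather today.", "I guess so.", satisfaction=0.5)
```

Keeping the two example lists separate reflects the abstract's distinction between imitating user responses and learning to predict feedback; how each list is then used for training is left out of this sketch.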

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.05415/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1901.05415/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/1901.05415/full.md

---
Source: https://tomesphere.com/paper/1901.05415