Data-Efficient Methods for Dialogue Systems
Igor Shalyminov

TL;DR
This thesis presents methods for training robust dialogue systems from minimal data: new data-efficient models, data augmentation techniques, and approaches for handling disfluencies and out-of-domain input.
Contribution
It introduces data-efficient models and training techniques for dialogue systems, targets the robustness problems that arise when training data is scarce, and evaluates the resulting methods on several dialogue tasks.
Findings
Ranked first at DSTC 8 Fast Domain Adaptation task
Third place in Amazon Alexa Prize 2017 and 2018 for social dialogue response ranking
The proposed Turn Dropout technique improves out-of-domain detection using only in-domain training data
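This summary does not describe how Turn Dropout works. The sketch below illustrates one plausible reading, assuming a negative-sampling scheme in which randomly chosen turns of an in-domain dialogue are replaced with unrelated utterances and relabelled with a fallback action, so a model can learn out-of-domain detection without real out-of-domain annotations. The names `turn_dropout`, `FALLBACK_ACTION`, `noise_utterances`, and `drop_prob` are hypothetical and are not taken from the thesis.

```python
import random
from typing import List, Tuple

# A turn is (user utterance, system action label); this layout is an assumption.
Turn = Tuple[str, str]

# Hypothetical label for the response the system gives to out-of-domain input.
FALLBACK_ACTION = "fallback"


def turn_dropout(
    dialogue: List[Turn],
    noise_utterances: List[str],
    drop_prob: float,
    rng: random.Random,
) -> List[Turn]:
    """Return a copy of `dialogue` with some turns replaced by noise.

    Each turn is independently replaced with probability `drop_prob`.
    A replaced turn gets a random utterance from `noise_utterances`
    and the fallback action as its target label.
    """
    augmented: List[Turn] = []
    for utterance, action in dialogue:
        if rng.random() < drop_prob:
            # Synthetic out-of-domain turn: unrelated text, fallback target.
            augmented.append((rng.choice(noise_utterances), FALLBACK_ACTION))
        else:
            # Keep the original in-domain turn unchanged.
            augmented.append((utterance, action))
    return augmented


if __name__ == "__main__":
    in_domain = [("book a table for two", "ask_time"), ("at 7pm", "confirm")]
    noise = ["what's the weather like", "play some jazz"]
    print(turn_dropout(in_domain, noise, drop_prob=0.3, rng=random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible. Where the noise utterances come from (for example, other dialogues or shuffled tokens) is left open here, since the summary does not say.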
Abstract
Conversational User Interface (CUI) has become ubiquitous in everyday life, in consumer-focused products like Siri and Alexa or business-oriented solutions. Deep learning underlies many recent breakthroughs in dialogue systems but requires very large amounts of training data, often annotated by experts. Trained with smaller data, these methods end up severely lacking robustness (e.g. to disfluencies and out-of-domain input), and often just have too little generalisation power. In this thesis, we address the above issues by introducing a series of methods for training robust dialogue systems from minimal data. Firstly, we study two orthogonal approaches to dialogue: linguistically informed and machine learning-based - from the data efficiency perspective. We outline the steps to obtain data-efficient solutions with either approach. We then introduce two data-efficient models for dialogue…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Byte Pair Encoding · Label Smoothing · Residual Connection · Multi-Head Attention · Adam · Dense Connections
