SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching
Baolin Peng, Chunyuan Li, Jinchao Li, Shahin Shayandeh, Lars Liden, Jianfeng Gao

TL;DR
SOLOIST introduces a scalable approach for building task-oriented dialogue systems using transfer learning and machine teaching, achieving state-of-the-art results with minimal task-specific data.
Contribution
The paper presents a Transformer-based pre-trained model that unifies dialog modules and enables efficient adaptation to new tasks with few examples.
Findings
Achieves state-of-the-art on CamRest676 and MultiWOZ benchmarks.
Significantly outperforms existing methods in few-shot settings.
Reduces labeling costs through machine teaching.
Abstract
We present a new method SOLOIST that uses transfer learning and machine teaching to build task bots at scale. We parameterize classical modular task-oriented dialog systems using a Transformer-based auto-regressive language model, which subsumes different dialog modules into a single neural model. We pre-train, on heterogeneous dialog corpora, a task-grounded response generation model, which can generate dialog responses grounded in user goals and real-world knowledge for task completion. The pre-trained model can be efficiently adapted to accomplish new tasks with a handful of task-specific dialogs via machine teaching, where training samples are generated by human teachers interacting with the system. Experiments show that (i) SOLOIST creates new state-of-the-art on well-studied task-oriented dialog benchmarks, including CamRest676 and MultiWOZ; (ii) in the few-shot fine-tuning…
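One way to picture "subsuming different dialog modules into a single neural model" is to flatten each dialog turn into one token sequence that an auto-regressive language model can be trained on left to right. The sketch below is illustrative only: the `DialogTurn` fields, the `belief :` and `db :` prefixes, and the ` => ` separator are hypothetical choices made for this example, not the format specified by the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DialogTurn:
    """One training example (hypothetical structure, for illustration)."""
    history: List[str]            # previous utterances, oldest first
    belief_state: Dict[str, str]  # slot -> value pairs tracking the user goal
    db_match_count: int           # number of database entries matching the goal
    response: str                 # system response to be generated


def serialize_turn(turn: DialogTurn, sep: str = " => ") -> str:
    """Flatten a turn into a single string for language-model training.

    The order (history, belief state, DB result, response) mirrors a
    classical pipeline: understand the user, query knowledge, then reply.
    """
    history_text = " ".join(turn.history)
    # Sort slots so the same belief state always serializes identically.
    belief_text = "belief : " + " ; ".join(
        f"{slot} = {value}" for slot, value in sorted(turn.belief_state.items())
    )
    db_text = f"db : {turn.db_match_count}"
    return sep.join([history_text, belief_text, db_text, turn.response])


example = DialogTurn(
    history=["user : i want a cheap restaurant"],
    belief_state={"pricerange": "cheap"},
    db_match_count=3,
    response="system : [restaurant_name] is a cheap place .",
)
serialized = serialize_turn(example)
print(serialized)
```

Under these assumptions, a single model trained on such strings would learn to predict the belief state from the history and the response from everything before it, which is the sense in which separate modules collapse into one sequence model.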
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
