AARGH! End-to-end Retrieval-Generation for Task-Oriented Dialog
Tomáš Nekvinda, Ondřej Dušek

TL;DR
AARGH is an end-to-end task-oriented dialog system that combines retrieval and generation in a single model. It uses a shared-parameter architecture and action-aware training to make responses more diverse while maintaining or improving task performance.
Contribution
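It introduces a retrieval-enhanced generation model in which a simplified single-encoder retriever and the generator share most of their parameters, together with a response selection method trained with an action-aware objective. On MultiWOZ, this yields more diverse outputs than state-of-the-art baselines while maintaining or improving state tracking and context-to-response generation performance.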
Findings
Produces more diverse responses than baselines
Maintains or improves state tracking accuracy
Enhances context-to-response generation quality
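One common way to quantify the response diversity mentioned above is a distinct-n ratio: the number of unique n-grams divided by the total number of n-grams over all system responses. The sketch below is only an illustration of that idea; the function name `distinct_n`, the whitespace tokenization, and the example responses are assumptions, and the source does not state which diversity metrics the paper reports.

```python
from collections import Counter


def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over all responses.

    Illustrative metric only; tokenization is plain whitespace splitting.
    """
    ngrams = Counter()
    for text in responses:
        tokens = text.lower().split()
        ngrams.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0


# Hypothetical system outputs: one repetitive set, one more varied set.
repetitive = ["the hotel is in the centre", "the hotel is in the north"]
varied = ["the hotel is in the centre", "sure , it sits up north"]

print(distinct_n(repetitive, 2))
print(distinct_n(varied, 2))
```

A higher value means fewer repeated n-grams, so a system that reuses the same templates scores lower than one that varies its wording.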
Abstract
We introduce AARGH, an end-to-end task-oriented dialog system combining retrieval and generative approaches in a single model, aiming at improving dialog management and lexical diversity of outputs. The model features a new response selection method based on an action-aware training objective and a simplified single-encoder retrieval architecture which allow us to build an end-to-end retrieval-enhanced generation model where retrieval and generation share most of the parameters. On the MultiWOZ dataset, we show that our approach produces more diverse outputs while maintaining or improving state tracking and context-to-response generation performance, compared to state-of-the-art baselines.
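The retrieve-then-generate flow described in the abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: a bag-of-words `encode` function stands in for the shared neural encoder, the current context is matched against stored training contexts by cosine similarity, and the retrieved response is attached to the generator input as a hint. All names (`encode`, `retrieve_hint`, `build_generator_input`), the input format, and the example data are hypothetical, and the action-aware training objective is not modeled here.

```python
import math
from collections import Counter


def encode(text):
    # Toy stand-in for the shared encoder: bag-of-words token counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_hint(context, train_pairs):
    # Return the response paired with the most similar stored context.
    query = encode(context)
    best = max(train_pairs, key=lambda pair: cosine(query, encode(pair[0])))
    return best[1]


def build_generator_input(context, hint):
    # Assumed input format: the generator sees the context plus the hint.
    return f"context: {context} hint: {hint}"


# Hypothetical (context, delexicalized response) pairs from a training set.
train_pairs = [
    ("i need a cheap hotel in the north", "[hotel_name] is a cheap hotel in the north ."),
    ("book a table for two at an italian restaurant", "your table at [restaurant_name] is booked ."),
]

context = "can you find me a cheap hotel"
hint = retrieve_hint(context, train_pairs)
generator_input = build_generator_input(context, hint)
print(generator_input)
```

In the system the abstract describes, retrieval and generation share most of their parameters, so the same encoder would serve both steps rather than a separate similarity function as in this sketch.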
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications
