TL;DR
This paper introduces a generalized hierarchical transformer framework for task-oriented dialog systems, improving context understanding by adapting standard transformers with specialized attention masks and positional encodings.
Contribution
The paper presents a novel framework that converts a standard transformer into a hierarchical encoder, improving dialog context modeling in task-oriented systems.
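To make the idea of "specialized attention masks and positional encodings" concrete, here is a minimal sketch in plain Python. It is not the paper's actual design. It assumes one simple scheme: each token attends only to tokens in its own utterance, and position indices restart at every utterance boundary. The names `utterance_block_mask` and `within_utterance_positions` are hypothetical and chosen only for this illustration.

```python
from typing import List


def utterance_block_mask(utterance_ids: List[int]) -> List[List[bool]]:
    """Return mask[i][j] = True when token i may attend to token j.

    Assumption (illustrative only): a token may attend solely to tokens
    that belong to the same utterance, giving a block-diagonal mask.
    """
    return [[ui == uj for uj in utterance_ids] for ui in utterance_ids]


def within_utterance_positions(utterance_ids: List[int]) -> List[int]:
    """Return position indices that restart at 0 for each new utterance."""
    positions: List[int] = []
    previous_id = None
    count = 0
    for uid in utterance_ids:
        # Continue counting inside an utterance, reset at a boundary.
        count = count + 1 if uid == previous_id else 0
        positions.append(count)
        previous_id = uid
    return positions


# A flattened dialog context: three utterances of lengths 3, 2, and 1.
utterance_ids = [0, 0, 0, 1, 1, 2]
mask = utterance_block_mask(utterance_ids)
positions = within_utterance_positions(utterance_ids)
```

In a real model, a mask like this would be applied in the lower (utterance-level) encoder layers, with a less restrictive mask in the context-level layers above them. The exact masks and encodings the framework uses should be taken from the paper itself.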
Findings
Hierarchical transformer encoders outperform standard transformer encoders at understanding dialog context.
The framework unifies various hierarchical models like HRED and HIBERT.
Experiments show improved natural language understanding in dialog tasks.
Abstract
Generative models for dialog systems have gained much interest because of the recent success of RNN and Transformer based models in tasks like question answering and summarization. Although the task of dialog response generation is generally seen as a sequence-to-sequence (Seq2Seq) problem, researchers in the past have found it challenging to train dialog systems using the standard Seq2Seq models. Therefore, to help the model learn meaningful utterance and conversation level features, Sordoni et al. (2015b); Serban et al. (2016) proposed Hierarchical RNN architecture, which was later adopted by several other RNN based dialog systems. With the transformer-based models dominating the seq2seq problems lately, the natural question to ask is the applicability of the notion of hierarchy in transformer based dialog systems. In this paper, we propose a generalized framework for Hierarchical…
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Tanh Activation · Sigmoid Activation · Byte Pair Encoding · Dropout · Softmax · Multi-Head Attention · Residual Connection
