Improving Matching Models with Hierarchical Contextualized Representations for Multi-turn Response Selection
Chongyang Tao, Wei Wu, Can Xu, Yansong Feng, Dongyan Zhao, Rui Yan

TL;DR
This paper pre-trains hierarchical contextualized representations, at the word level and the sentence level, on large-scale conversations and blends them into matching models for multi-turn response selection in retrieval-based chatbots.
Contribution
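It proposes pre-training hierarchical contextualized representations by learning a dialogue generation model with a hierarchical encoder-decoder architecture, then blending the word-level and sentence-level representations into the input and output layers of a matching model.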
Findings
Improvement over existing models on two benchmark conversation datasets
Hierarchical representations outperform flat models in multi-turn settings
Effective blending of word-level and sentence-level features
Abstract
In this paper, we study context-response matching with pre-trained contextualized representations for multi-turn response selection in retrieval-based chatbots. Existing models, such as CoVe and ELMo, are trained with limited context (often a single sentence or paragraph) and may not work well on multi-turn conversations because of their hierarchical nature, informal language, and domain-specific words. To address these challenges, we propose pre-training hierarchical contextualized representations, including contextual word-level and sentence-level representations, by learning a dialogue generation model from large-scale conversations with a hierarchical encoder-decoder architecture. The two levels of representations are then blended into the input and output layers of a matching model, respectively. Experimental results on two benchmark conversation datasets indicate that the proposed…
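The data flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: a plain tanh RNN stands in for the encoders, the weights are random rather than pre-trained on a dialogue generation task, concatenation is assumed as the blending operation, and every name (`encode_dialogue`, `blend_input`, `blend_output`, `matching_features`) is hypothetical. It only shows where word-level and sentence-level vectors come from in a hierarchical encoder and where each could enter a matching model.

```python
import math
import random

random.seed(0)
EMB, HID = 4, 3  # toy sizes, chosen only for illustration


def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def rnn(inputs, w_in, w_rec):
    """Plain tanh RNN (a stand-in encoder); returns one hidden state per input."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        a, b = matvec(w_in, x), matvec(w_rec, hidden)
        hidden = [math.tanh(p + q) for p, q in zip(a, b)]
        states.append(hidden)
    return states


# Random weights stand in for parameters that would be pre-trained.
W_WORD_IN, W_WORD_REC = rand_matrix(HID, EMB), rand_matrix(HID, HID)
W_UTT_IN, W_UTT_REC = rand_matrix(HID, HID), rand_matrix(HID, HID)


def encode_dialogue(utterances):
    """utterances: list of utterances, each a list of EMB-dim word embeddings."""
    # Lower level: contextual word vectors inside each utterance.
    word_level = [rnn(u, W_WORD_IN, W_WORD_REC) for u in utterances]
    # Upper level: run over one summary vector per utterance.
    summaries = [states[-1] for states in word_level]
    sentence_level = rnn(summaries, W_UTT_IN, W_UTT_REC)
    return word_level, sentence_level


def blend_input(static_embs, contextual_states):
    # Assumed blend at the input layer: concatenate per word.
    return [e + h for e, h in zip(static_embs, contextual_states)]


def blend_output(matching_features, sentence_vec):
    # Assumed blend at the output layer: concatenate before scoring.
    return matching_features + sentence_vec


# A toy two-turn dialogue with 3 and 2 words per utterance.
dialogue = [[[random.uniform(-1, 1) for _ in range(EMB)] for _ in range(n)]
            for n in (3, 2)]
word_level, sentence_level = encode_dialogue(dialogue)
blended_in = blend_input(dialogue[0], word_level[0])
blended_out = blend_output([0.5, 0.2], sentence_level[-1])
```

Here `blended_in` holds one vector of size `EMB + HID` per word of the first utterance, and `blended_out` appends the last sentence-level vector to two placeholder matching features. A real matching model would feed these into its own layers.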
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Sigmoid Activation · Tanh Activation · ELMo · Long Short-Term Memory · GloVe Embeddings · Bidirectional LSTM · Location-based Attention · Sequence to Sequence · Softmax · Contextual Word Vectors
