Multi-Cast Attention Networks for Retrieval-based Question Answering and Response Prediction
Yi Tay, Luu Anh Tuan, Siu Cheung Hui

TL;DR
This paper introduces Multi-Cast Attention Networks (MCAN), a novel attention-based model architecture that enhances feature representation in retrieval-based question answering and response prediction tasks, achieving state-of-the-art results.
Contribution
The paper proposes a new attention mechanism called casted attention, allowing multiple attention types to be combined simultaneously for improved representation learning.
Findings
MCAN outperforms existing models by 9% on the Ubuntu Dialogue Corpus.
MCAN achieves the best score to date on the TrecQA dataset.
Casting several attention variants simultaneously removes the need to tune the choice of co-attention layer and makes the model easier for practitioners to interpret.
Abstract
Attention is typically used to select informative sub-phrases that are used for prediction. This paper investigates the novel use of attention as a form of feature augmentation, i.e., casted attention. We propose Multi-Cast Attention Networks (MCAN), a new attention mechanism and general model architecture for a potpourri of ranking tasks in the conversational modeling and question answering domains. Our approach performs a series of soft attention operations, each time casting a scalar feature upon the inner word embeddings. The key idea is to provide a real-valued hint (feature) to a subsequent encoder layer and is targeted at improving the representation learning process. There are several advantages to this design, e.g., it allows an arbitrary number of attention mechanisms to be casted, allowing for multiple attention types (e.g., co-attention, intra-attention) and attention…
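The casting idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`cast_scalar`, `multi_cast`), the dot-product affinity, and the choice to compress each attended vector into a scalar by summing its elementwise product with the original embedding are all assumptions made for illustration. The sketch only shows the general pattern of running more than one soft attention (here one co-attention and one intra-attention) and appending one scalar per attention to every word embedding before it would reach an encoder.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cast_scalar(attended, original):
    # Hypothetical compression step (an assumption, not taken from the summary):
    # reduce the interaction between attended and original embeddings
    # to a single real-valued feature per word.
    return (attended * original).sum(axis=-1, keepdims=True)

def multi_cast(q, d):
    """q: (len_q, dim) query embeddings, d: (len_d, dim) document embeddings.

    Returns q with one extra scalar column per attention type.
    """
    # Co-attention: each query word softly attends over the document words.
    affinity = q @ d.T                          # (len_q, len_d)
    q_co = softmax(affinity, axis=1) @ d        # (len_q, dim)

    # Intra-attention: each query word softly attends over the query itself.
    q_intra = softmax(q @ q.T, axis=1) @ q      # (len_q, dim)

    # Cast one scalar "hint" per attention type onto every word embedding.
    hints = [cast_scalar(q_co, q), cast_scalar(q_intra, q)]
    return np.concatenate([q] + hints, axis=-1)  # (len_q, dim + 2)

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
d = rng.normal(size=(6, 8))
augmented = multi_cast(q, d)
print(augmented.shape)  # (4, 10)
```

In this sketch the original embeddings pass through unchanged and only the extra columns carry attention information, so further attention variants could be added by appending more entries to `hints` without changing the embedding dimension the rest of the model depends on.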
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Expert finding and Q&A systems
