ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs
Wenpeng Yin, Hinrich Schütze, Bing Xiang, Bowen Zhou

TL;DR
This paper introduces ABCNN, an attention-based CNN model that effectively captures mutual influence between sentence pairs, leading to state-of-the-art results across multiple NLP tasks such as answer selection, paraphrase identification, and textual entailment.
Contribution
The paper presents a versatile ABCNN model with three novel attention schemes that incorporate mutual sentence influence into CNN representations, improving performance across various tasks.
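As a rough illustration of what "mutual influence" between two sentences can look like computationally, the sketch below builds a soft alignment matrix between the feature maps of two sentences and derives per-position attention weights from it. This summary does not specify the paper's scoring function or how each of the three schemes uses the matrix, so the distance-based score 1 / (1 + Euclidean distance), the function names `attention_matrix` and `attention_weights`, and the random feature maps are all assumptions made for illustration.

```python
import numpy as np

def attention_matrix(f0, f1):
    """Soft alignment between two sentences.

    f0: (d, s0) feature map of sentence 0 (one column per position)
    f1: (d, s1) feature map of sentence 1
    Returns an (s0, s1) matrix with entries in (0, 1]; the score
    1 / (1 + Euclidean distance) is an assumed choice, not taken
    from the summary above.
    """
    diff = f0[:, :, None] - f1[:, None, :]   # (d, s0, s1) pairwise differences
    dist = np.linalg.norm(diff, axis=0)      # (s0, s1) pairwise distances
    return 1.0 / (1.0 + dist)

def attention_weights(a):
    """Per-position weights: row sums for sentence 0, column sums for sentence 1."""
    return a.sum(axis=1), a.sum(axis=0)

# Hypothetical inputs: 4-dimensional features, sentences of length 5 and 7.
rng = np.random.default_rng(0)
f0 = rng.normal(size=(4, 5))
f1 = rng.normal(size=(4, 7))

a = attention_matrix(f0, f1)
w0, w1 = attention_weights(a)
```

A position whose features lie close to many positions in the other sentence receives a larger weight, so the representation of each sentence depends on its counterpart. Such weights could, for example, rescale columns before pooling; that use is likewise an assumption here.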
Findings
Achieved state-of-the-art results on answer selection, paraphrase identification, and textual entailment.
Demonstrated the effectiveness of attention schemes in modeling sentence pair interactions.
Validated the general applicability of ABCNN across different NLP tasks.
Abstract
How to model a pair of sentences is a critical issue in many NLP tasks such as answer selection (AS), paraphrase identification (PI) and textual entailment (TE). Most prior work (i) deals with one individual task by fine-tuning a specific system; (ii) models each sentence's representation separately, rarely considering the impact of the other sentence; or (iii) relies fully on manually designed, task-specific linguistic features. This work presents a general Attention Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences. We make three contributions. (i) ABCNN can be applied to a wide variety of tasks that require modeling of sentence pairs. (ii) We propose three attention schemes that integrate mutual influence between sentences into CNN; thus, the representation of each sentence takes into consideration its counterpart. These interdependent sentence pair…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
