Contextualized Embeddings based Convolutional Neural Networks for Duplicate Question Identification
Harsh Sakhrani, Saloni Parekh, Pratik Ratadiya

TL;DR
This paper introduces a neural network architecture that combines a Bidirectional Transformer Encoder with CNNs for question paraphrase identification. It reports state-of-the-art results on the Quora Question Pairs dataset without relying on expensive recurrence mechanisms.
Contribution
It proposes an architecture that integrates a Bidirectional Transformer Encoder with CNNs, compares two inference setups (Siamese and Matched Aggregation), and reports improved performance along with insights into the effects of fine-tuning.
Findings
The model achieves state-of-the-art performance on the Quora Question Pairs dataset.
Adding convolution layers improves results in both inference setups.
The Matched Aggregation setup consistently outperforms the Siamese setup.
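The summary does not define the two inference setups, so the following is only a rough sketch of how they are commonly understood: a Siamese setup encodes each question separately with shared weights and then combines the two vectors, while a matched-aggregation setup encodes the pair jointly as one sequence. The `toy_encoder` function, the `[CLS]`/`[SEP]` markers, and the `[u, v, |u - v|]` combination are all illustrative assumptions, not details taken from the paper.

```python
from typing import Callable, List

Vector = List[float]
Encoder = Callable[[List[str]], Vector]  # hypothetical: tokens -> feature vector


def toy_encoder(tokens: List[str]) -> Vector:
    """Hypothetical stand-in for the Transformer encoder + CNN stack.

    It returns two crude features (token count, mean token length) so the
    data flow of the two setups can be shown without a trained model.
    """
    n = len(tokens)
    mean_len = sum(len(t) for t in tokens) / n if n else 0.0
    return [float(n), mean_len]


def siamese_features(q1: List[str], q2: List[str], encode: Encoder) -> Vector:
    """Encode each question independently, then combine the two vectors."""
    u = encode(q1)
    v = encode(q2)
    abs_diff = [abs(a - b) for a, b in zip(u, v)]
    # Assumed combination: concatenate u, v, and their element-wise |u - v|.
    return u + v + abs_diff


def matched_aggregation_features(q1: List[str], q2: List[str], encode: Encoder) -> Vector:
    """Encode the pair jointly so the encoder sees both questions at once."""
    joint = ["[CLS]"] + q1 + ["[SEP]"] + q2 + ["[SEP]"]
    return encode(joint)


q1 = "how do i learn python".split()
q2 = "what is the best way to learn python".split()
siamese = siamese_features(q1, q2, toy_encoder)
matched = matched_aggregation_features(q1, q2, toy_encoder)
print(len(siamese), len(matched))
```

In both cases the resulting feature vector would be passed to a classifier that decides whether the pair is a duplicate. The practical difference is that the joint encoding lets attention operate across the two questions, whereas the Siamese encoding keeps them apart until the combination step.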
Abstract
Question Paraphrase Identification (QPI) is a critical task for large-scale Question-Answering forums. The purpose of QPI is to determine whether a given pair of questions are semantically identical or not. Previous approaches for this task have yielded promising results, but have often relied on complex recurrence mechanisms that are expensive and time-consuming in nature. In this paper, we propose a novel architecture combining a Bidirectional Transformer Encoder with Convolutional Neural Networks for the QPI task. We produce the predictions from the proposed architecture using two different inference setups: Siamese and Matched Aggregation. Experimental results demonstrate that our model achieves state-of-the-art performance on the Quora Question Pairs dataset. We empirically prove that the addition of convolution layers to the model architecture improves the results in both…
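To make the "convolution over contextualized embeddings" idea concrete, here is a minimal pure-Python sketch of a single 1D convolution filter sliding over a sequence of token embeddings, followed by max-over-time pooling. The abstract does not state filter widths, filter counts, activation, or pooling choices, so the ReLU activation, the max pooling, and the toy numbers below are assumptions for illustration only.

```python
from typing import List

Matrix = List[List[float]]  # rows are tokens, columns are embedding dimensions


def conv1d_relu(embeddings: Matrix, kernel: Matrix, bias: float = 0.0) -> List[float]:
    """Apply one filter of width len(kernel) along the token axis (no padding)."""
    width = len(kernel)
    feature_map = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        total = bias
        for token_vec, kernel_row in zip(window, kernel):
            total += sum(x * w for x, w in zip(token_vec, kernel_row))
        feature_map.append(max(0.0, total))  # ReLU (assumed activation)
    return feature_map


def max_over_time(feature_map: List[float]) -> float:
    """Keep the strongest response of the filter across all positions."""
    return max(feature_map) if feature_map else 0.0


# Toy "contextual embeddings": 4 tokens, 2 dimensions each (made-up values).
token_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter spanning 2 consecutive tokens (made-up weights).
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]

feature_map = conv1d_relu(token_embeddings, bigram_filter)
pooled = max_over_time(feature_map)
print(feature_map, pooled)
```

A real model would use many filters, typically of several widths, and concatenate their pooled outputs into one fixed-size vector per input. Unlike a recurrent layer, each window is computed independently of the others, which is what allows the computation to run in parallel across positions.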
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Residual Connection · Dropout · Softmax · Label Smoothing
