Contextualized Embeddings based Convolutional Neural Networks for Duplicate Question Identification

Harsh Sakhrani; Saloni Parekh; Pratik Ratadiya

arXiv:2109.01560·cs.CL·September 7, 2021

TL;DR

This paper proposes a neural network architecture that combines a Bidirectional Transformer Encoder with CNNs for question paraphrase identification. It reports state-of-the-art results on the Quora Question Pairs dataset without relying on expensive recurrence mechanisms.

Contribution

It proposes a new architecture integrating a Bidirectional Transformer Encoder with CNNs and compares two inference setups, Siamese and Matched Aggregation, demonstrating improved performance and offering insights into the effects of fine-tuning.

Findings

01

The model achieves state-of-the-art performance on the Quora Question Pairs dataset.

02

Adding convolution layers improves results in both inference setups.

03

The Matched Aggregation setup consistently outperforms the Siamese setup.

Abstract

Question Paraphrase Identification (QPI) is a critical task for large-scale Question-Answering forums. The purpose of QPI is to determine whether a given pair of questions are semantically identical or not. Previous approaches for this task have yielded promising results, but have often relied on complex recurrence mechanisms that are expensive and time-consuming in nature. In this paper, we propose a novel architecture combining a Bidirectional Transformer Encoder with Convolutional Neural Networks for the QPI task. We produce the predictions from the proposed architecture using two different inference setups: Siamese and Matched Aggregation. Experimental results demonstrate that our model achieves state-of-the-art performance on the Quora Question Pairs dataset. We empirically prove that the addition of convolution layers to the model architecture improves the results in both…
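The abstract does not give layer sizes or the exact wiring of either inference setup, so the following is only a minimal sketch of the general idea behind a Siamese setup with a convolutional layer, not the authors' implementation. It assumes the contextualized token embeddings have already been produced by an encoder (here they are hand-written toy vectors), applies the same convolution filters with max-over-time pooling to each question, and compares the two pooled vectors with cosine similarity. The function names `conv1d_max_pool`, `cosine`, and `siamese_score`, the filter values, and the choice of cosine similarity as the comparison are all illustrative assumptions.

```python
import math


def conv1d_max_pool(embeddings, filters, bias=0.0):
    """Slide each filter over the token embeddings, apply ReLU, and keep the
    maximum activation per filter (max-over-time pooling).

    embeddings: list of token vectors, shape (seq_len, dim)
    filters:    list of filters, each a list of `width` vectors of size dim
    returns:    one pooled feature per filter
    """
    features = []
    for filt in filters:
        width = len(filt)
        best = 0.0  # starting at 0.0 is equivalent to applying ReLU before pooling
        for start in range(len(embeddings) - width + 1):
            window = embeddings[start:start + width]
            activation = bias + sum(
                w * x
                for filt_vec, tok_vec in zip(filt, window)
                for w, x in zip(filt_vec, tok_vec)
            )
            best = max(best, activation)
        features.append(best)
    return features


def cosine(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def siamese_score(emb_a, emb_b, filters):
    """Siamese setup: both questions pass through the SAME filters
    independently, and only the pooled vectors are compared."""
    return cosine(conv1d_max_pool(emb_a, filters), conv1d_max_pool(emb_b, filters))


# Toy stand-ins for contextualized embeddings (hypothetical values, dim = 2).
question_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Two hypothetical filters of width 2.
toy_filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]

pooled_a = conv1d_max_pool(question_a, toy_filters)
score = siamese_score(question_a, question_b, toy_filters)
```

In a trained model the filters would be learned and the similarity would typically feed a classifier that outputs a duplicate or non-duplicate label. The sketch covers only the shared-weights, encode-separately structure that defines a Siamese setup.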

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Residual Connection · Dropout · Softmax · Label Smoothing