Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis

Akshita Jha; Adithya Samavedhi; Vineeth Rakesh; Jaideep Chandrashekar; Chandan K. Reddy

arXiv:2302.03765·cs.CL·February 9, 2023

TL;DR

This paper compares transformer-based models with simple neural models for long document matching, showing that the simpler models often match or beat the transformers while training faster, using less energy and memory, and staying more robust.

Contribution

It empirically demonstrates that simple neural models can outperform transformer-based models in long document matching tasks, challenging the field's recent focus on transformer-based encoders.

Findings

1. Simple models outperform BERT-based models in accuracy for document matching.

2. Simple models require less training time, energy, and memory.

3. Simple models are more robust to document length variations and text perturbations.

Abstract

Recent advances in the area of long document matching have primarily focused on using transformer-based models for long document encoding and matching. There are two primary challenges associated with these models. Firstly, the performance gain provided by transformer-based models comes at a steep cost - both in terms of the required training time and the resource (memory and energy) consumption. The second major limitation is their inability to handle more than a pre-defined input token length at a time. In this work, we empirically demonstrate the effectiveness of simple neural models (such as feed-forward networks, and CNNs) and simple embeddings (like GloVe, and Paragraph Vector) over transformer-based models on the task of document matching. We show that simple models outperform the more complex BERT-based models while taking significantly less training time, energy, and memory.…
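
To make the contrast in the abstract concrete, here is a minimal sketch of the general kind of "simple" pipeline it mentions: fixed word embeddings pooled into a document vector, followed by a small feed-forward matcher. It is not the authors' implementation. The vocabulary, the random vectors standing in for pretrained GloVe embeddings, the pair features, and the names `encode`, `pair_features`, and `FeedForwardMatcher` are all assumptions made for illustration, and the network is left untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: random vectors stand in for pretrained GloVe embeddings.
EMBED_DIM = 8
VOCAB = ["long", "document", "matching", "transformer",
         "model", "simple", "energy", "memory"]
EMBEDDINGS = {word: rng.normal(size=EMBED_DIM) for word in VOCAB}


def encode(text):
    """Mean-pool the word vectors of a document.

    Pooling yields a fixed-size vector for any document length,
    so there is no maximum input token limit.
    """
    vectors = [EMBEDDINGS[t] for t in text.lower().split() if t in EMBEDDINGS]
    if not vectors:
        return np.zeros(EMBED_DIM)
    return np.mean(vectors, axis=0)


def pair_features(doc_a, doc_b):
    """Combine two document vectors into one feature vector (hypothetical choice)."""
    u, v = encode(doc_a), encode(doc_b)
    return np.concatenate([u, v, np.abs(u - v), u * v])


class FeedForwardMatcher:
    """One-hidden-layer network scoring a document pair (untrained sketch)."""

    def __init__(self, in_dim, hidden=16):
        self.w1 = rng.normal(scale=0.1, size=(in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(scale=0.1, size=hidden)
        self.b2 = 0.0

    def predict_proba(self, features):
        hidden = np.maximum(0.0, features @ self.w1 + self.b1)  # ReLU
        logit = hidden @ self.w2 + self.b2
        return 1.0 / (1.0 + np.exp(-logit))  # sigmoid match probability


short_doc = "simple model energy memory"
long_doc = "long document matching " * 5000  # far longer than a 512-token window

features = pair_features(short_doc, long_doc)
matcher = FeedForwardMatcher(in_dim=features.shape[0])
score = matcher.predict_proba(features)
print(f"match score (untrained): {score:.3f}")
```

Because the document vector is an average, the 15,000-token input costs one pass over its tokens and produces the same 8-dimensional vector shape as the 4-token input. A model with a fixed token window would instead need truncation or chunking for the longer document.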

Taxonomy

Topics: Handwritten Text Recognition Techniques · Topic Modeling · Natural Language Processing Techniques

Methods: GloVe Embeddings