Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis
Akshita Jha, Adithya Samavedhi, Vineeth Rakesh, Jaideep Chandrashekar, Chandan K. Reddy

TL;DR
This paper compares transformer-based models with simple neural models for long document matching, showing that simpler models often outperform complex transformers in efficiency, robustness, and resource consumption.
Contribution
It empirically demonstrates that simple neural models can outperform transformer-based models in long document matching tasks, challenging current assumptions.
Findings
Simple models outperform BERT-based models in accuracy for document matching.
Simple models require less training time, energy, and memory.
Simple models are more robust to document length variations and text perturbations.
Abstract
Recent advances in the area of long document matching have primarily focused on using transformer-based models for long document encoding and matching. There are two primary challenges associated with these models. Firstly, the performance gain provided by transformer-based models comes at a steep cost - both in terms of the required training time and the resource (memory and energy) consumption. The second major limitation is their inability to handle more than a pre-defined input token length at a time. In this work, we empirically demonstrate the effectiveness of simple neural models (such as feed-forward networks, and CNNs) and simple embeddings (like GloVe, and Paragraph Vector) over transformer-based models on the task of document matching. We show that simple models outperform the more complex BERT-based models while taking significantly less training time, energy, and memory.…
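As a rough illustration of why simple embeddings sidestep the input-length limit mentioned in the abstract, the sketch below averages word vectors into one fixed-size document vector and compares two documents by cosine similarity. It is not the paper's method: the paper trains feed-forward networks and CNNs on top of embeddings such as GloVe, whereas this sketch uses an untrained cosine score, and the names `TOY_EMBEDDINGS`, `embed_document`, and `match_score` as well as the vector values are hypothetical stand-ins chosen for the example.

```python
import math

# Toy 3-dimensional word vectors standing in for pretrained GloVe embeddings.
# The values are hypothetical and chosen only for illustration.
TOY_EMBEDDINGS = {
    "neural": [0.9, 0.1, 0.0],
    "network": [0.8, 0.2, 0.1],
    "model": [0.7, 0.3, 0.0],
    "river": [0.0, 0.1, 0.9],
    "water": [0.1, 0.0, 0.8],
}


def embed_document(text, table=TOY_EMBEDDINGS):
    """Average the vectors of all known tokens into one fixed-size vector.

    The output dimension is the same for any document length, so there is
    no maximum number of input tokens.
    """
    vectors = [table[tok] for tok in text.lower().split() if tok in table]
    if not vectors:
        # No known tokens: fall back to a zero vector of the right size.
        dim = len(next(iter(table.values())))
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)


def match_score(doc_a, doc_b):
    """Score how closely two documents match via their averaged embeddings."""
    return cosine_similarity(embed_document(doc_a), embed_document(doc_b))


if __name__ == "__main__":
    print(match_score("neural network model", "model network"))
    print(match_score("neural network model", "river water"))
```

In a trained setup, the two document vectors would typically be fed to a small classifier rather than compared directly; the cosine score here only keeps the example self-contained.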
Taxonomy
Topics: Handwritten Text Recognition Techniques · Topic Modeling · Natural Language Processing Techniques
Methods: GloVe Embeddings
