Attention over pre-trained Sentence Embeddings for Long Document Classification

Amine Abdaoui; Sourav Dutta

arXiv:2307.09084 · cs.CL · July 19, 2023 · 1 cites

Open Access

TL;DR

This paper proposes an attention architecture that scales linearly with document length and builds on pre-trained sentence embeddings for long document classification. It achieves results competitive with state-of-the-art models under standard fine-tuning, and better results when the underlying transformers are frozen.

Contribution

It introduces a simple, efficient architecture combining pre-trained sentence transformers with a small attention layer for long document classification.

Findings

01. Competitive results on three datasets.

02. Better performance with frozen transformers.

03. Effective alternative to complex long-document models.

Abstract

Despite being the current de facto models for most NLP tasks, transformers are often limited to short sequences because the cost of their attention is quadratic in the number of tokens. Several approaches to this issue have been studied: reducing the cost of the self-attention computation, or modeling smaller sequences and combining them through a recurrence mechanism or a new transformer model. In this paper, we propose taking advantage of pre-trained sentence transformers to start from semantically meaningful embeddings of the individual sentences, and then combining them through a small attention layer that scales linearly with the document length. We report the results obtained by this simple architecture on three standard document classification datasets. When compared with the current state-of-the-art models using standard fine-tuning, the studied method obtains…
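The pipeline described in the abstract (fixed per-sentence embeddings combined by a small attention layer) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the excerpt does not specify the form of the attention layer, so the single learned query vector, the scaled dot-product scoring, and the names `attention_pool` and `classify` are all hypothetical. The toy vectors stand in for the output of a pre-trained sentence transformer.

```python
import math
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax (subtracts the max before exponentiating)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    sentence_embeddings: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool n sentence embeddings into one document vector.

    Assumption: a single (learned) query vector scores each sentence once,
    so the cost is O(n * d) for n sentences of dimension d, i.e. linear in
    the number of sentences rather than quadratic.
    """
    d = len(query)
    scores = [dot(query, e) / math.sqrt(d) for e in sentence_embeddings]
    weights = softmax(scores)
    pooled = [
        sum(w * e[j] for w, e in zip(weights, sentence_embeddings))
        for j in range(d)
    ]
    return pooled, weights


def classify(
    pooled: Sequence[float],
    class_weights: Sequence[Sequence[float]],
    class_biases: Sequence[float],
) -> int:
    """Hypothetical linear classification head: index of the largest logit."""
    logits = [dot(w, pooled) + b for w, b in zip(class_weights, class_biases)]
    return max(range(len(logits)), key=logits.__getitem__)


# Toy stand-ins for sentence-transformer outputs (3 sentences, dimension 2).
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [0.5, -0.5]  # would be a trained parameter in practice
doc_vector, attn_weights = attention_pool(toy_embeddings, toy_query)
label = classify(doc_vector, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

Because each sentence is scored against the query exactly once, doubling the number of sentences doubles the work. In a frozen-transformer setup, only `toy_query` and the classifier parameters would be trained, while the sentence embeddings stay fixed.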

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques