Contrastive Document Representation Learning with Graph Attention Networks

Peng Xu; Xinchi Chen; Xiaofei Ma; Zhiheng Huang; Bing Xiang

arXiv:2110.10778·cs.CL·October 22, 2021

Open Access

TL;DR

This paper introduces a novel approach combining pretrained Transformer models with graph attention networks and contrastive learning to effectively generate embeddings for very long documents, improving performance in classification and retrieval tasks.

Contribution

It presents a new graph attention network framework on top of Transformers and a contrastive pretraining strategy for long document representation learning.
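As a rough illustration of what a graph attention layer over Transformer outputs could look like, the sketch below applies one single-head attention layer in the style of the standard GAT formulation to a handful of chunk embeddings, then mean-pools the result into a document vector. Everything concrete here is an assumption made for illustration: the names `gat_layer` and `mean_pool`, the toy two-dimensional embeddings, the chain-shaped neighbor graph, and the pooling step. The summary above does not describe how the paper builds its graph or reads out the document embedding.

```python
import math

def matvec(W, x):
    # Multiply matrix W (list of rows) by vector x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gat_layer(H, neighbors, W, a):
    """One single-head graph attention layer.

    H: list of node vectors (assumed here to be chunk embeddings
       produced by a pretrained encoder).
    neighbors: dict mapping node index -> list of neighbor indices
       (include the node itself to get a self-loop).
    W: projection matrix, shape out_dim x in_dim.
    a: attention vector of length 2 * out_dim.
    """
    Z = [matvec(W, h) for h in H]  # projected node features
    out = []
    for i, zi in enumerate(Z):
        nbrs = neighbors[i]
        # zi + Z[j] is list concatenation, i.e. [W h_i || W h_j].
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, zi + Z[j])))
            for j in nbrs
        ]
        alpha = softmax(scores)  # attention weights over neighbors of i
        out.append([
            sum(al * Z[j][d] for al, j in zip(alpha, nbrs))
            for d in range(len(zi))
        ])
    return out

def mean_pool(vectors):
    # Average node vectors into one document-level vector (an assumed readout).
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

# Toy example: three chunk embeddings connected as a chain with self-loops.
chunk_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
chain = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection, for readability
a = [0.5, -0.5, 0.25, 0.25]    # arbitrary attention parameters
doc_embedding = mean_pool(gat_layer(chunk_embeddings, chain, W, a))
```

Because each chunk is encoded separately and only the graph layer mixes information across chunks, the quadratic attention cost applies per chunk rather than to the whole document, which is the motivation the abstract gives for this kind of design.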

Findings

01 Improved document classification accuracy

02 Enhanced document retrieval performance

03 Effective handling of long text sequences

Abstract

Recent pretrained Transformer-based language models have shown great success in learning contextual representations of text. However, because self-attention has quadratic complexity in sequence length, most pretrained Transformer models can only handle relatively short text, and modeling very long documents remains a challenge. In this work, we propose to use a graph attention network on top of an available pretrained Transformer model to learn document embeddings. This graph attention network allows us to leverage the high-level semantic structure of the document. In addition, based on our graph document model, we design a simple contrastive learning strategy to pretrain our models on a large unlabeled corpus. Empirically, we demonstrate the effectiveness of our approaches on document classification and document retrieval tasks.
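The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common choice, an InfoNCE-style loss with in-batch negatives, and not necessarily the paper's strategy. The function name `info_nce`, the temperature value, and the assumption that each anchor embedding is paired with one positive embedding derived from the same document are all illustrative.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalize(v):
    # Scale v to unit length; leave an all-zero vector unchanged.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss with in-batch negatives.

    anchors[i] and positives[i] are assumed to be two embeddings of the
    same document; positives[j] for j != i serve as negatives for anchor i.
    """
    a = [normalize(x) for x in anchors]
    p = [normalize(x) for x in positives]
    losses = []
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every positive, scaled by temperature.
        logits = [dot(ai, pj) / temperature for pj in p]
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index i as the target class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two documents with two-dimensional embeddings.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(anchors, [[1.0, 0.0], [0.0, 1.0]])   # positives match anchors
shuffled = info_nce(anchors, [[0.0, 1.0], [1.0, 0.0]])  # positives mismatched
```

In this toy batch the loss is small when each anchor is closest to its own positive and large when the pairs are shuffled, which is the signal that would let such an objective train document embeddings without labels.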

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies

Methods: Contrastive Learning