Cascaded Semantic and Positional Self-Attention Network for Document Classification

Juyong Jiang; Jie Zhang; Kai Zhang

arXiv:2009.07148·cs.CL·September 22, 2020

Open Access

TL;DR

This paper introduces CSPAN, a novel architecture combining semantic and positional information through cascaded self-attention and Bi-LSTM for improved document classification.

Contribution

The paper presents a new cascaded semantic and positional self-attention network that adaptively integrates semantic and positional data, outperforming traditional positional encoding methods.

Findings

01 CSPAN improves classification accuracy on benchmark datasets.

02 The model maintains a compact size and converges quickly.

03 It offers more interpretable interaction between semantics and positions.

Abstract

Transformers have shown great success in learning representations for language modelling. However, how to systematically aggregate semantic information (word embeddings) with positional (or temporal) information (word order) remains an open challenge. In this work, we propose a new architecture, the cascaded semantic and positional self-attention network (CSPAN), to aggregate the two sources of information in the context of document classification. CSPAN uses a semantic self-attention layer cascaded with a Bi-LSTM to process the semantic and positional information sequentially, and then adaptively combines them through a residual connection. Compared with commonly used positional encoding schemes, CSPAN can exploit the interaction between semantics and word positions in a more interpretable and adaptive manner, and the classification performance can be…
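
The cascade described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single attention head, the hidden size of d/2 per LSTM direction, the plain additive residual, and all names (`init_params`, `cspan_block`, and so on) are assumptions made only for illustration. The idea shown is that self-attention without positional encoding handles semantics, a Bi-LSTM run on its output injects word order, and the two are summed.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over X (T, d), no positional encoding."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

def lstm_pass(X, W, U, b):
    """One LSTM direction over X (T, d); returns hidden states (T, h)."""
    h_dim = U.shape[0]
    h, c = np.zeros(h_dim), np.zeros(h_dim)
    states = []
    for x in X:
        i, f, o, g = np.split(x @ W + h @ U + b, 4)   # gate pre-activations
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return np.stack(states)

def bilstm(X, fwd, bwd):
    """Concatenate forward and backward LSTM states: (T, 2h)."""
    hf = lstm_pass(X, *fwd)
    hb = lstm_pass(X[::-1], *bwd)[::-1]   # run reversed, then restore order
    return np.concatenate([hf, hb], axis=-1)

def init_params(d, seed=0):
    """Random weights for illustration only; assumes d is even so 2h == d."""
    assert d % 2 == 0
    h = d // 2
    rng = np.random.default_rng(seed)
    mat = lambda *shape: rng.normal(scale=0.5, size=shape)
    lstm = lambda: (mat(d, 4 * h), mat(h, 4 * h), np.zeros(4 * h))
    return {"Wq": mat(d, d), "Wk": mat(d, d), "Wv": mat(d, d),
            "fwd": lstm(), "bwd": lstm()}

def cspan_block(X, params):
    """Semantic self-attention cascaded with a Bi-LSTM, joined by a residual sum.

    The plain sum is an assumption; the abstract only says the two streams
    are combined adaptively through a residual connection.
    """
    S = self_attention(X, params["Wq"], params["Wk"], params["Wv"])  # semantics
    P = bilstm(S, params["fwd"], params["bwd"])                      # word order
    return S + P
```

In this sketch the attention output alone is insensitive to word order once averaged over positions, so any order dependence in a pooled document vector comes from the Bi-LSTM branch. A classifier head (for example mean pooling followed by a linear layer) would sit on top of `cspan_block`; that part is omitted here because the abstract does not describe it.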

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies