Exploiting Local and Global Features in Transformer-based Extreme Multi-label Text Classification

Ruohong Zhang; Yau-Shian Wang; Yiming Yang; Tom Vu; Likun Lei

arXiv:2204.00933 · cs.CL · April 5, 2022 · 1 cites

TL;DR

This paper introduces a method that combines local word-level and global document features from Transformer models to enhance extreme multi-label text classification performance, addressing limitations of using only global features.

Contribution

The paper proposes a novel approach integrating local and global features from Transformers for improved XMTC, demonstrating competitive results on benchmark datasets.

Findings

01 Outperforms or matches state-of-the-art methods

02 Effective use of combined local and global features

03 Improves classification accuracy in XMTC tasks

Abstract

Extreme multi-label text classification (XMTC) is the task of tagging each document with the relevant labels from a very large space of predefined categories. Recently, large pre-trained Transformer models have brought significant performance improvements to XMTC. These models typically use the embedding of the special CLS token to represent the semantics of the entire document as a global feature vector, and match it against candidate labels. However, we argue that such a global feature vector may not be sufficient to represent the different granularity levels of semantics in the document, and that complementing it with local word-level features could bring additional gains. Based on this insight, we propose an approach that combines both the local and global features produced by Transformer models to improve the prediction power of the classifier. Our experiments show that the proposed model either…
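
The abstract does not spell out how the local and global features are combined, so the following is only a rough sketch of one plausible reading, not the authors' model. It assumes the first token vector plays the role of the CLS embedding (global feature), pools the remaining word-level vectors with a per-label attention (local feature), concatenates the two, and scores each label with a sigmoid. All names (`score_labels`, `label_queries`, `label_weights`, `demo_scores`) and the random toy inputs are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def score_labels(token_vecs, label_queries, label_weights):
    """Score every label from one document's token vectors.

    Assumptions (not taken from the paper):
      - token_vecs[0] stands in for the CLS embedding (global feature);
      - token_vecs[1:] are word-level embeddings (local features);
      - each label has a hypothetical attention query and a hypothetical
        weight vector over the concatenated [global; local] features.
    """
    cls_vec = token_vecs[0]
    word_vecs = token_vecs[1:]
    dim = len(cls_vec)
    scores = []
    for query, weight in zip(label_queries, label_weights):
        # Label-specific attention over the word-level vectors.
        attn = softmax([dot(query, w) for w in word_vecs])
        local_vec = [
            sum(a * w[i] for a, w in zip(attn, word_vecs)) for i in range(dim)
        ]
        # Concatenate global and local features, then apply a sigmoid.
        combined = cls_vec + local_vec
        logit = dot(weight, combined)
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


def demo_scores(num_tokens=6, dim=4, num_labels=3, seed=0):
    """Run score_labels on random toy vectors (stand-ins for real encoder output)."""
    rng = random.Random(seed)

    def rand_vec(n):
        return [rng.uniform(-1.0, 1.0) for _ in range(n)]

    token_vecs = [rand_vec(dim) for _ in range(num_tokens)]
    label_queries = [rand_vec(dim) for _ in range(num_labels)]
    label_weights = [rand_vec(2 * dim) for _ in range(num_labels)]
    return score_labels(token_vecs, label_queries, label_weights)


if __name__ == "__main__":
    print(demo_scores())
```

In a real system the token vectors would come from a pre-trained Transformer encoder and the per-label parameters would be learned; concatenation is just one simple way to let the classifier see both granularities at once.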

Taxonomy

Topics: Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining · Spam and Phishing Detection

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Dropout · Softmax · Layer Normalization · Label Smoothing · Byte Pair Encoding · Position-Wise Feed-Forward Layer