OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for Extreme Multi-label Text Classification
Jie Cao, Yin Zhang

TL;DR
OTSeq2Set is an autoregressive sequence-to-set model for extreme multi-label text classification (XMTC). It uses optimal transport to make label prediction permutation-invariant and outperforms existing methods on large-scale benchmarks.
Contribution
The paper proposes OTSeq2Set, a novel sequence-to-set model using optimal transport and bipartite matching to handle unordered label sets in XMTC tasks.
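To illustrate why bipartite matching suits an unordered label set, here is a minimal sketch; it is not the paper's implementation. It assumes a decoder that emits one probability distribution per step (the hypothetical `step_probs`) and searches for the ordering of the target labels with the lowest total negative log-likelihood. The brute-force search over permutations is an assumption chosen for readability; a practical system would more likely use a polynomial-time assignment solver such as the Hungarian algorithm.

```python
import math
from itertools import permutations


def best_assignment(step_probs, target_labels):
    """Return (cost, order) for the cheapest matching of labels to steps.

    step_probs: list of dicts mapping label -> probability, one per decoding
                step (hypothetical decoder output, for illustration only).
    target_labels: the gold labels, in any order.
    """
    if len(step_probs) < len(target_labels):
        raise ValueError("need at least one decoding step per target label")
    best_cost, best_order = math.inf, None
    # Try every ordering of the targets; keep the one with the lowest
    # summed negative log-likelihood. Only feasible for small label sets.
    for order in permutations(target_labels):
        cost = sum(-math.log(step_probs[t][label]) for t, label in enumerate(order))
        if cost < best_cost:
            best_cost, best_order = cost, order
    return best_cost, best_order


# Hypothetical distributions over three labels at two decoding steps.
step_probs = [
    {"a": 0.1, "b": 0.7, "c": 0.2},
    {"a": 0.6, "b": 0.2, "c": 0.2},
]
cost_ab, order_ab = best_assignment(step_probs, ["a", "b"])
cost_ba, order_ba = best_assignment(step_probs, ["b", "a"])
```

Because the loss is computed against the best matching rather than a fixed label order, listing the targets as `["a", "b"]` or `["b", "a"]` yields the same cost, which is the permutation invariance the contribution refers to.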
Findings
Outperforms baselines on 4 benchmark datasets
Achieves a 16.34% improvement in micro-F1 on the Wikipedia dataset
Demonstrates effectiveness of optimal transport in label prediction
Abstract
Extreme multi-label text classification (XMTC) is the task of finding the most relevant subset of labels from an extremely large-scale label collection. Recently, some deep learning models have achieved state-of-the-art results in XMTC tasks. These models commonly predict scores for all labels with a fully connected layer as the last layer of the model. However, such models cannot predict a relatively complete, variable-length label subset for each document, because they select the positive labels relevant to the document by a fixed threshold or take the top k labels in descending order of scores. A less popular type of deep learning model, sequence-to-sequence (Seq2Seq), focuses on predicting variable-length positive labels in sequence style. However, the labels in XMTC tasks are essentially an unordered set rather than an ordered sequence; the default order of labels restrains Seq2Seq models…
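The limitation of fixed-threshold and top-k selection mentioned in the abstract can be seen in a small sketch. The scores, label names, threshold, and k below are hypothetical values chosen for illustration, not taken from the paper.

```python
def select_by_threshold(scores, threshold=0.5):
    """Keep every label whose score reaches a fixed threshold."""
    return {label for label, score in scores.items() if score >= threshold}


def select_top_k(scores, k):
    """Keep the k highest-scoring labels, regardless of their scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


# Hypothetical per-label scores for one document with three relevant labels.
scores = {"physics": 0.9, "optics": 0.45, "lasers": 0.4, "cooking": 0.05}
relevant = {"physics", "optics", "lasers"}

by_threshold = select_by_threshold(scores, threshold=0.5)
by_top_k = select_top_k(scores, k=2)

missed_by_threshold = relevant - by_threshold
missed_by_top_k = relevant - by_top_k
```

In this example the threshold keeps only one label and top-2 always returns exactly two, so both miss relevant labels. Neither rule adapts the size of the predicted set to the document.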
Taxonomy
Topics: Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence
