Exploiting Dynamic and Fine-grained Semantic Scope for Extreme Multi-label Text Classification
Yuan Wang, Huiling Song, Peng Huo, Tao Xu, Jucheng Yang, Yarui Chen, and Tingting Zhao

TL;DR
This paper introduces TReaderXML, a novel framework for extreme multi-label text classification that dynamically leverages fine-grained semantic scope from teacher knowledge to improve label prediction accuracy, especially on sparse datasets.
Contribution
The paper proposes a dynamic, fine-grained semantic scope mechanism for XMTC, utilizing teacher knowledge and a dual cooperative network to enhance label prediction performance.
Findings
Achieves state-of-the-art results on three benchmark datasets.
Performs particularly well on imbalanced and sparse datasets.
Demonstrates the effectiveness of dynamic semantic scope modeling.
Abstract
Extreme multi-label text classification (XMTC) refers to the problem of tagging a given text with the most relevant subset of labels from a large label set. A majority of labels only have a few training instances due to large label dimensionality in XMTC. To solve this data sparsity issue, most existing XMTC methods take advantage of fixed label clusters obtained at an early stage to balance performance on tail labels and head labels. However, such label clusters provide a static and coarse-grained semantic scope for every text, which ignores the distinct characteristics of different texts and has difficulty modelling an accurate semantic scope for texts with tail labels. In this paper, we propose a novel framework TReaderXML for XMTC, which adopts dynamic and fine-grained semantic scope from teacher knowledge for each individual text to optimize text conditional prior category semantic ranges.…
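The data sparsity the abstract describes (most labels having only a few training instances) can be made concrete with a small count. The following is a minimal sketch in Python, not anything taken from the paper: the function name `tail_label_share`, the `max_count` threshold, and the toy annotations are all assumptions chosen for illustration.

```python
from collections import Counter


def tail_label_share(label_sets, max_count=2):
    """Fraction of distinct labels that appear in at most `max_count` training texts.

    `label_sets` holds one list of labels per training text. The threshold
    `max_count` is an illustrative assumption, not a value from the paper.
    """
    # Count each label once per text, even if it is listed twice for that text.
    counts = Counter(label for labels in label_sets for label in set(labels))
    if not counts:
        return 0.0
    tail = [label for label, n in counts.items() if n <= max_count]
    return len(tail) / len(counts)


# Toy, made-up annotations: "sports" and "news" act as head labels,
# the remaining four labels each occur only once.
toy_label_sets = [
    ["sports", "football"],
    ["sports", "tennis"],
    ["sports", "news"],
    ["news", "politics"],
    ["news", "sports", "olympics"],
]

share = tail_label_share(toy_label_sets, max_count=1)
print(f"{share:.2f}")
```

In this toy set, four of the six distinct labels occur exactly once, so two thirds of the label vocabulary counts as tail labels even though two head labels cover every text. Real XMTC benchmarks show the same shape at a much larger scale.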
Taxonomy
Topics: Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques
