Your Next State-of-the-Art Could Come from Another Domain: A Cross-Domain Analysis of Hierarchical Text Classification
Nan Li, Bo Kang, Tijl De Bie

TL;DR
This paper provides a comprehensive cross-domain analysis of hierarchical text classification methods, introducing a unified framework and demonstrating that techniques from one domain can achieve state-of-the-art results in others.
Contribution
The paper offers the first unified overview and empirical analysis of hierarchical text classification across multiple domains, highlighting the importance of cross-domain learning.
Findings
Achieved new state-of-the-art results using cross-domain techniques.
Provided guidelines for effective hierarchical text classification.
Established a unified evaluation framework for the field.
Abstract
Text classification with hierarchical labels is a prevalent and challenging task in natural language processing. Examples include assigning ICD codes to patient records, tagging patents into IPC classes, assigning EUROVOC descriptors to European legal texts, and more. Despite its widespread applications, a comprehensive understanding of state-of-the-art methods across different domains has been lacking. In this paper, we provide the first comprehensive cross-domain overview with empirical analysis of state-of-the-art methods. We propose a unified framework that positions each method within a common structure to facilitate research. Our empirical analysis yields key insights and guidelines, confirming the necessity of learning across different research areas to design effective methods. Notably, under our unified evaluation pipeline, we achieved new state-of-the-art results by applying…
Taxonomy
Topics: Text and Document Classification Technologies
