Constrained Sequence-to-Tree Generation for Hierarchical Text Classification
Chao Yu, Yi Shen, Yue Mao, Longjun Cai

TL;DR
This paper introduces Seq2Tree, a sequence-to-tree framework with constrained decoding for hierarchical text classification, improving label consistency and performance over previous flat classification methods.
Contribution
The paper proposes a novel sequence-to-tree approach with dynamic vocabulary decoding to better model hierarchical labels in text classification.
Findings
Significant and consistent improvements over previous work on three benchmark datasets.
Enhanced label consistency in hierarchical classification.
Outperforms previous flat classification approaches.
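To make "label consistency" concrete, here is a minimal sketch of one way to check it, assuming a prediction counts as consistent when every predicted label's parent is also predicted. The toy taxonomy `PARENT` and the helper names are hypothetical illustrations, not taken from the paper.

```python
from typing import Dict, Optional, Set

# Hypothetical toy taxonomy: child label -> parent label.
# None marks a top-level label. These labels are illustrative only.
PARENT: Dict[str, Optional[str]] = {
    "Science": None,
    "Sports": None,
    "Physics": "Science",
    "Biology": "Science",
    "Soccer": "Sports",
}


def inconsistent_labels(
    predicted: Set[str], parent: Dict[str, Optional[str]]
) -> Set[str]:
    """Return the predicted labels whose parent is missing from the prediction."""
    return {
        label
        for label in predicted
        if parent[label] is not None and parent[label] not in predicted
    }


def is_consistent(predicted: Set[str], parent: Dict[str, Optional[str]]) -> bool:
    """A prediction is consistent when no label is predicted without its parent."""
    return not inconsistent_labels(predicted, parent)


# A flat multi-label classifier scores each label independently, so nothing
# stops it from emitting "Physics" without "Science".
flat_prediction = {"Physics", "Sports"}
orphans = inconsistent_labels(flat_prediction, PARENT)
```

Here `orphans` contains "Physics", because its parent "Science" is absent from the flat prediction.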
Abstract
Hierarchical Text Classification (HTC) is a challenging task where a document can be assigned to multiple hierarchically structured categories within a taxonomy. The majority of prior studies treat HTC as a flat multi-label classification problem, which inevitably leads to the "label inconsistency" problem. In this paper, we formulate HTC as a sequence generation task and introduce a sequence-to-tree framework (Seq2Tree) for modeling the hierarchical label structure. Moreover, we design a constrained decoding strategy with a dynamic vocabulary to ensure the label consistency of the results. Compared with previous works, the proposed approach achieves significant and consistent improvements on three benchmark datasets.
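The idea of constrained decoding with a dynamic vocabulary can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: it assumes the label tree is linearized depth-first, that a hypothetical `<pop>` token returns to the parent node, and that a stand-in `toy_score` function plays the role of the decoder's token scores. The taxonomy and every name in the block are invented for illustration.

```python
from typing import Callable, Dict, List, Set

# Hypothetical special tokens for a depth-first linearization of the label tree.
ROOT, POP, EOS = "<root>", "<pop>", "<eos>"

# Hypothetical toy taxonomy: node -> list of child labels.
CHILDREN: Dict[str, List[str]] = {
    ROOT: ["Science", "Sports"],
    "Science": ["Physics", "Biology"],
    "Sports": ["Soccer"],
    "Physics": [],
    "Biology": [],
    "Soccer": [],
}


def allowed_tokens(stack: List[str], emitted: Set[str]) -> List[str]:
    """Dynamic vocabulary: only tokens that keep the output a valid tree."""
    current = stack[-1]
    options = [c for c in CHILDREN[current] if c not in emitted]
    # At the root the sequence may end; elsewhere it may return to the parent.
    options.append(EOS if current == ROOT else POP)
    return options


def constrained_greedy_decode(
    score: Callable[[List[str], str], float], max_steps: int = 50
) -> List[str]:
    """Greedy decoding where each step picks the best-scoring *allowed* token."""
    stack: List[str] = [ROOT]
    emitted: Set[str] = set()
    output: List[str] = []
    for _ in range(max_steps):
        vocab = allowed_tokens(stack, emitted)
        token = max(vocab, key=lambda t: score(output, t))
        output.append(token)
        if token == EOS:
            break
        if token == POP:
            stack.pop()
        else:
            emitted.add(token)
            stack.append(token)
    return output


# Stand-in for a real decoder: it "wants" Science, Physics, and Soccer,
# but gives the parent label Sports a low score.
WANTED = {"Science", "Physics", "Soccer"}


def toy_score(prefix: List[str], token: str) -> float:
    if token in WANTED:
        return 1.0
    if token in (POP, EOS):
        return 0.5
    return 0.0


decoded = constrained_greedy_decode(toy_score)
labels = [t for t in decoded if t not in (POP, EOS)]
```

In this toy run the scorer favors "Soccer", yet "Soccer" never enters the vocabulary because its parent "Sports" was not generated, so the decoded label set stays consistent by construction. With a real sequence-to-sequence model, the same restriction would typically be applied by masking the scores of disallowed tokens before selecting the next token.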
Taxonomy
Topics: Text and Document Classification Technologies
