On Horizontal and Vertical Separation in Hierarchical Text Classification
Mostafa Dehghani, Hosein Azarbonyad, Jaap Kamps, Maarten Marx

TL;DR
This paper investigates the importance of separability in hierarchical text classification, proposing models that improve accuracy by considering entity positions both within and across hierarchy levels.
Contribution
It introduces the Strong Separation Principle and Hierarchical Significant Words Language Models (HSWLM) for better hierarchical data representation and classification.
Findings
HSWLM captures essential hierarchical features
Improved classification accuracy demonstrated on real data
Models are transferable over time
Abstract
Hierarchy is a common and effective way of organizing data and representing their relationships at different levels of abstraction. However, hierarchical data dependencies cause difficulties in the estimation of "separable" models that can distinguish between the entities in the hierarchy. Extracting separable models of hierarchical entities requires us to take their relative position into account and to consider the different types of dependencies in the hierarchy. In this paper, we present an investigation of the effect of separability in text-based entity classification and argue that in hierarchical classification, a separation property should be established between entities not only in the same layer, but also in different layers. Our main findings are as follows. First, we analyse the importance of separability on the data representation in the task of classification and based…
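To make the idea of separation within a layer and across layers concrete, the following minimal sketch measures how far apart the unigram language models of entities in a toy hierarchy are. It is an illustration only, not the paper's HSWLM estimation procedure. The Jensen-Shannon divergence as the separation measure, the add-one smoothing, the function names, and the toy texts are all assumptions introduced here.

```python
import math
from collections import Counter


def unigram_model(text, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions on the same vocabulary."""
    m = {w: 0.5 * (p[w] + q[w]) for w in p}

    def kl(a, b):
        return sum(a[w] * math.log2(a[w] / b[w]) for w in a if a[w] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical toy hierarchy: one parent entity with two child entities.
texts = {
    "parent": "budget debate vote law tax health budget vote",
    "child_a": "budget tax cut market tax business vote",
    "child_b": "budget health care welfare health school vote",
}

vocab = sorted({w for t in texts.values() for w in t.lower().split()})
models = {name: unigram_model(t, vocab) for name, t in texts.items()}

# Horizontal separation: between siblings in the same layer.
horizontal = js_divergence(models["child_a"], models["child_b"])
# Vertical separation: between an entity and its parent in the layer above.
vertical = js_divergence(models["parent"], models["child_a"])

print(f"horizontal (child_a vs child_b): {horizontal:.3f}")
print(f"vertical   (parent  vs child_a): {vertical:.3f}")
```

Under these assumptions, words shared by all entities (such as "budget" and "vote" in the toy texts) pull the models together, so both scores stay well below the maximum of 1. A model that kept only the terms characteristic of each entity would score higher on both measures, which is the intuition behind asking for separation in both directions.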
