A Route Confidence Evaluation Method for Reliable Hierarchical Text Categorization
Nima Hatami, Camelia Chira, Giuliano Armano

TL;DR
This paper introduces a confidence evaluation method for hierarchical text categorization that improves accuracy by assessing route reliability and rejecting uncertain samples, validated on the Reuters dataset.
Contribution
It proposes a novel route confidence evaluation approach that incorporates hierarchy-based weights and an acceptance/rejection strategy to enhance HTC performance.
Findings
Improves categorization accuracy by rejecting low-reliability samples.
Effective on the Reuters RCV1-v2 dataset, outperforming state-of-the-art methods.
Utilizes hierarchy-aware weighting for confidence assessment.
Abstract
Hierarchical Text Categorization (HTC) is becoming increasingly important with the rapidly growing amount of text data available on the World Wide Web. Among the different strategies proposed to cope with HTC, the Local Classifier per Node (LCN) approach attains good performance by mirroring the underlying class hierarchy while enforcing a top-down strategy in the testing step. However, the problem of embedding hierarchical information (parent-child relationships) to improve the performance of HTC systems remains open. A confidence evaluation method for a selected route in the hierarchy is proposed to evaluate the reliability of the final candidate labels in an HTC system. To exploit the information embedded in the hierarchy, weight factors are used to reflect the importance of each level. An acceptance/rejection strategy in the top-down decision making…
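The abstract is cut off before it states how route confidence is computed, so the following is only an illustrative sketch of the general idea, not the authors' formula. It assumes that each local classifier on the selected top-down route emits a score in [0, 1], that the route confidence is a weighted average of those scores with one weight per hierarchy level, and that a route is rejected when its confidence falls below a threshold. The function names, the weighted-average form, the weights, and the threshold are all hypothetical.

```python
from typing import Sequence


def route_confidence(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-level classifier scores along one route.

    Assumption: a weighted average is used here purely for illustration;
    the source does not specify the combination rule.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total_weight


def accept_route(scores: Sequence[float], weights: Sequence[float],
                 threshold: float) -> bool:
    """Accept the route's final label only if its confidence reaches the threshold."""
    return route_confidence(scores, weights) >= threshold


# Hypothetical three-level route, with upper levels weighted more heavily.
level_weights = [3.0, 2.0, 1.0]
confident_route = [0.9, 0.8, 0.4]
uncertain_route = [0.6, 0.5, 0.4]

print(accept_route(confident_route, level_weights, threshold=0.75))
print(accept_route(uncertain_route, level_weights, threshold=0.75))
```

Weighting upper levels more heavily is one plausible choice, since an error near the root sends the whole top-down route astray; other weightings fit the same interface.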
Taxonomy
Topics: Text and Document Classification Technologies · Spam and Phishing Detection · Web Data Mining and Analysis
