# Efficient Path Prediction for Semi-Supervised and Weakly Supervised Hierarchical Text Classification

**Authors:** Huiru Xiao, Xin Liu, Yangqiu Song

arXiv: 1902.09347 · 2019-02-26

## TL;DR

This paper introduces a path cost-sensitive learning algorithm for hierarchical text classification that effectively utilizes unlabeled and weakly labeled data, leveraging structural information to improve accuracy and efficiency.

## Contribution

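The paper proposes a path cost-sensitive learning method that incorporates the class hierarchy structure and unlabeled data. Because feature vectors do not have to be reconstructed for different structures, it reduces computational cost relative to structural output learning in semi-supervised and weakly supervised settings.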

## Key findings

- Makes effective use of unlabeled and weakly labeled data through a generative model with path constraints
- Reduces computational cost compared to structural output learning
- Reported as both effective and efficient on two hierarchical text classification benchmarks

## Abstract

Hierarchical text classification has many real-world applications. However, labeling a large number of documents is costly. In practice, we can use semi-supervised learning or weakly supervised learning (e.g., dataless classification) to reduce the labeling cost. In this paper, we propose a path cost-sensitive learning algorithm to utilize the structural information and further make use of unlabeled and weakly labeled data. We use a generative model to leverage the large amount of unlabeled data and introduce path constraints into the learning algorithm to incorporate the structural information of the class hierarchy. The posterior probabilities of both unlabeled and weakly labeled data can be incorporated with path-dependent scores. Because we add a structure-sensitive cost to the learning algorithm to keep the classification consistent with the class hierarchy, and do not need to reconstruct the feature vectors for different structures, we can significantly reduce the computational cost compared to structural output learning. Experimental results on two hierarchical text classification benchmarks show that our approach is not only effective but also efficient in handling semi-supervised and weakly supervised hierarchical text classification.
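
The abstract does not define the path-dependent scores, so the following is only a rough sketch of the general idea of combining leaf posteriors with a structure-sensitive cost, not the authors' algorithm. It assumes a hypothetical three-leaf hierarchy (`PATHS`), a hypothetical cost equal to the number of tree edges between two leaves (`path_cost`), and a standard minimum-expected-cost decision rule (`min_expected_cost_leaf`); the posterior values are made up for illustration.

```python
from typing import Dict, List

# Hypothetical toy hierarchy: each leaf class is keyed by its root-to-leaf path.
PATHS: Dict[str, List[str]] = {
    "soccer": ["root", "sports", "soccer"],
    "tennis": ["root", "sports", "tennis"],
    "cpu": ["root", "tech", "cpu"],
}


def path_cost(a: str, b: str) -> int:
    """Assumed cost: number of tree edges between leaves a and b."""
    pa, pb = PATHS[a], PATHS[b]
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    # Edges from each leaf up to the deepest common ancestor.
    return (len(pa) - shared) + (len(pb) - shared)


def min_expected_cost_leaf(posterior: Dict[str, float]) -> str:
    """Pick the leaf that minimizes the posterior-weighted path cost."""

    def expected_cost(candidate: str) -> float:
        return sum(
            p * path_cost(true_leaf, candidate)
            for true_leaf, p in posterior.items()
        )

    return min(PATHS, key=expected_cost)


# Made-up leaf posteriors, e.g. from a generative model such as naive Bayes.
posterior = {"soccer": 0.35, "tennis": 0.25, "cpu": 0.40}
flat_choice = max(posterior, key=posterior.get)
path_choice = min_expected_cost_leaf(posterior)
print(flat_choice, path_choice)
```

In this toy setting the flat argmax selects `cpu`, while the path-aware rule selects `soccer`, because 60% of the probability mass sits under the `sports` branch and a mistake inside that branch costs fewer edges. The cost matrix is applied to fixed leaf posteriors, so no per-structure feature vectors are built, which loosely mirrors the efficiency argument in the abstract.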

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.09347/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1902.09347/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1902.09347/full.md

---
Source: https://tomesphere.com/paper/1902.09347