Computationally Efficient Wasserstein Loss for Structured Labels
Ayato Toyokuni, Sho Yokoi, Hisashi Kashima, Makoto Yamada

TL;DR
This paper introduces a label distribution learning (LDL) algorithm regularized by the tree-Wasserstein distance. The method incorporates hierarchical label structure during training and compares favorably with the Sinkhorn algorithm in computation time and memory usage on structured label prediction tasks.
Contribution
The paper presents a neural network approach that predicts the entire label hierarchy and measures the similarity between predicted and true labels with the tree-Wasserstein distance, which is cheaper in computation time and memory than the Sinkhorn algorithm.
Findings
Effective consideration of label hierarchy during training
Favorable comparison with the Sinkhorn algorithm in computation time and memory usage
Successful application to synthetic and real-world datasets
Abstract
The problem of estimating the probability distribution of labels has been widely studied as a label distribution learning (LDL) problem, whose applications include age estimation, emotion analysis, and semantic segmentation. We propose a tree-Wasserstein distance regularized LDL algorithm, focusing on hierarchical text classification tasks. We propose predicting the entire label hierarchy using neural networks, where the similarity between predicted and true labels is measured using the tree-Wasserstein distance. Through experiments using synthetic and real-world datasets, we demonstrate that the proposed method successfully considers the structure of labels during training, and it compares favorably with the Sinkhorn algorithm in terms of computation time and memory usage.
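As a rough illustration of the quantity the abstract refers to, the tree-Wasserstein distance between two distributions on the nodes of a weighted tree has a standard closed form: the sum over edges of the edge weight times the absolute difference in mass that the two distributions place on the subtree below that edge. The sketch below computes that closed form in plain Python. It is not the authors' implementation; the function name `tree_wasserstein`, the parent-array encoding, and the small label tree are assumptions made for this example.

```python
def tree_wasserstein(parent, weight, mu, nu):
    """Tree-Wasserstein distance between node distributions mu and nu.

    parent[i] is the index of node i's parent (the root uses -1), and
    weight[i] is the length of the edge from node i to parent[i]
    (the root's entry is ignored). This encoding is an assumption of
    this sketch, not something specified in the paper.
    """
    n = len(parent)

    # Depth of each node, so children can be visited before parents.
    # (Simple O(n * depth) loop; fine for a small illustrative tree.)
    depth = [0] * n
    for i in range(n):
        j = i
        while parent[j] >= 0:
            j = parent[j]
            depth[i] += 1

    # subtree[i] accumulates (mu - nu) summed over the subtree rooted at i.
    subtree = [m - v for m, v in zip(mu, nu)]
    for i in sorted(range(n), key=lambda k: -depth[k]):
        if parent[i] >= 0:
            subtree[parent[i]] += subtree[i]

    # Each non-root node i stands for the edge (i, parent[i]).
    return sum(weight[i] * abs(subtree[i]) for i in range(n) if parent[i] >= 0)


# Hypothetical label hierarchy: root 0 with children 1 and 2;
# node 1 has leaves 3 and 4; node 2 has leaf 5. All edges have length 1.
parent = [-1, 0, 0, 1, 1, 2]
weight = [0.0, 1.0, 1.0, 1.0, 1.0, 1.0]

on_leaf_3 = [0, 0, 0, 1, 0, 0]
on_leaf_4 = [0, 0, 0, 0, 1, 0]
on_leaf_5 = [0, 0, 0, 0, 0, 1]

sibling_cost = tree_wasserstein(parent, weight, on_leaf_3, on_leaf_4)
cousin_cost = tree_wasserstein(parent, weight, on_leaf_3, on_leaf_5)
```

In this toy tree, confusing two sibling leaves costs less than confusing leaves under different parents, which is the kind of hierarchy-aware penalty the abstract describes. The closed form needs only one pass over the tree, whereas Sinkhorn iterates over a full pairwise cost matrix.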
Taxonomy
Topics: Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications
