When Labels Have Structure: Improving Image Classification with Hierarchy-Aware Cross-Entropy
April Chan, Davide D'Ascenzo, Sebastiano Cultrera di Montesano

TL;DR
This paper introduces Hierarchy-Aware Cross-Entropy (HACE), a loss function that uses a known class hierarchy to improve image classification accuracy across six architectures and three datasets (CIFAR-100, FGVC Aircraft, and NABirds).
Contribution
HACE is a novel, easy-to-implement loss that incorporates class hierarchy information into training, enhancing model performance.
Findings
HACE improves accuracy in 15 out of 18 architecture-dataset pairs.
HACE outperforms all baselines in linear probing on frozen features.
HACE yields a mean accuracy gain of 4.66% in end-to-end training.
Abstract
Standard cross-entropy is the default classification loss across virtually all of machine learning, yet it treats all misclassifications equally, ignoring the semantic distances that a class hierarchy encodes. We propose Hierarchy-Aware Cross-Entropy (HACE), a drop-in replacement for standard cross-entropy that incorporates a known class hierarchy directly into the loss. HACE combines two components: prediction aggregation, which propagates the model's probability mass upward through the class hierarchy to ensure that parent nodes accumulate the confidence of their children; and ancestral label smoothing, which distributes the ground-truth signal along the path from the true class to the root. We evaluate HACE on CIFAR-100, FGVC Aircraft, and NABirds in two regimes: end-to-end training across six architectures spanning convolutional and attention-based designs, and linear probing on…
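The two components described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The toy hierarchy, the function names (`aggregate`, `ancestral_targets`, `hace_like_loss`), the geometric `decay` weighting along the path to the root, and the way the two parts are combined into one loss are all assumptions made for illustration; the abstract does not specify these details.

```python
import math

def softmax(logits):
    # Numerically stable softmax over the leaf-class logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def path_to_root(node, parent):
    # Nodes from `node` up to the root, inclusive.
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path

def aggregate(leaf_probs, leaves, parent):
    # Prediction aggregation: each node accumulates the probability
    # mass of every leaf class beneath it.
    node_probs = {}
    for leaf, p in zip(leaves, leaf_probs):
        for node in path_to_root(leaf, parent):
            node_probs[node] = node_probs.get(node, 0.0) + p
    return node_probs

def ancestral_targets(true_leaf, parent, decay=0.5):
    # Ancestral label smoothing: spread the target along the path from
    # the true class to the root. Geometric decay is an ASSUMPTION here.
    path = path_to_root(true_leaf, parent)
    weights = [decay ** i for i in range(len(path))]
    total = sum(weights)
    return {n: w / total for n, w in zip(path, weights)}

def hace_like_loss(logits, true_leaf, leaves, parent, decay=0.5):
    # Cross-entropy between smoothed targets and aggregated predictions.
    node_probs = aggregate(softmax(logits), leaves, parent)
    targets = ancestral_targets(true_leaf, parent, decay)
    return -sum(t * math.log(node_probs[n]) for n, t in targets.items())

# Hypothetical toy hierarchy: root -> {animal, vehicle}, animal -> {cat, dog}.
parent = {"root": None, "animal": "root", "vehicle": "root",
          "cat": "animal", "dog": "animal", "car": "vehicle"}
leaves = ["cat", "dog", "car"]

# True class is "cat"; one model confuses it with "dog", the other with "car".
loss_near = hace_like_loss([0.0, 2.0, 0.0], "cat", leaves, parent)
loss_far = hace_like_loss([0.0, 0.0, 2.0], "cat", leaves, parent)
print(loss_near < loss_far)
```

In this sketch both predictions assign the same probability to the true leaf, so standard cross-entropy would score them identically. The hierarchy-aware version penalizes the "car" confusion more, because less probability mass reaches the shared ancestor "animal". Setting `decay=0.0` puts all target weight on the true leaf and recovers standard cross-entropy.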
