When Labels Have Structure: Improving Image Classification with Hierarchy-Aware Cross-Entropy

April Chan; Davide D'Ascenzo; Sebastiano Cultrera di Montesano

arXiv:2605.06274 · cs.LG · May 8, 2026

TL;DR

This paper introduces Hierarchy-Aware Cross-Entropy (HACE), a loss function that uses a known class hierarchy to improve image classification accuracy. It is evaluated on CIFAR-100, FGVC Aircraft, and NABirds across six architectures.

Contribution

HACE is a drop-in replacement for standard cross-entropy that incorporates a known class hierarchy directly into the loss through two components: prediction aggregation and ancestral label smoothing.

Findings

01  HACE improves accuracy in 15 out of 18 architecture-dataset pairs.

02  HACE outperforms all baselines in linear probing on frozen features.

03  Mean accuracy gain of 4.66% in end-to-end training.

Abstract

Standard cross-entropy is the default classification loss across virtually all of machine learning, yet it treats all misclassifications equally, ignoring the semantic distances that a class hierarchy encodes. We propose Hierarchy-Aware Cross-Entropy (HACE), a drop-in replacement for standard cross-entropy that incorporates a known class hierarchy directly into the loss. HACE combines two components: prediction aggregation, which propagates the model's probability mass upward through the class hierarchy to ensure that parent nodes accumulate the confidence of their children; and ancestral label smoothing, which distributes the ground-truth signal along the path from the true class to the root. We evaluate HACE on CIFAR-100, FGVC Aircraft, and NABirds in two regimes: end-to-end training across six architectures spanning convolutional and attention-based designs, and linear probing on…
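
The two components named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not give the exact weighting or normalization, so the toy hierarchy, the function names, and the geometric decay parameter `beta` used for ancestral label smoothing are all assumptions made for illustration only.

```python
import math

# Hypothetical toy hierarchy (child -> parent); "root" has no parent.
PARENT = {
    "animal": "root", "vehicle": "root",
    "cat": "animal", "dog": "animal",
    "car": "vehicle", "truck": "vehicle",
}
LEAVES = ["cat", "dog", "car", "truck"]  # order matches the logit vector


def path_to_root(node):
    """Return [node, parent, ..., root]."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def softmax(logits):
    """Numerically stable softmax over the leaf logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(leaf_probs):
    """Prediction aggregation: every node accumulates the probability
    mass of the leaves beneath it (each leaf also keeps its own mass)."""
    node_probs = {}
    for leaf, p in zip(LEAVES, leaf_probs):
        for node in path_to_root(leaf):
            node_probs[node] = node_probs.get(node, 0.0) + p
    return node_probs


def ancestral_targets(true_leaf, beta=0.5):
    """Ancestral label smoothing (assumed form): spread the target over
    the path from the true leaf to the root with weights beta**k,
    where k is the number of steps above the leaf, then normalize."""
    path = path_to_root(true_leaf)
    weights = [beta ** k for k in range(len(path))]
    total = sum(weights)
    return {node: w / total for node, w in zip(path, weights)}


def hierarchy_aware_ce(logits, true_leaf, beta=0.5):
    """Cross-entropy between the smoothed path targets and the
    aggregated node probabilities."""
    node_probs = aggregate(softmax(logits))
    targets = ancestral_targets(true_leaf, beta)
    return -sum(t * math.log(node_probs[n]) for n, t in targets.items())


# True class is "cat"; compare a sibling confusion with a cross-branch one.
loss_sibling = hierarchy_aware_ce([0.0, 3.0, 0.0, 0.0], "cat")  # mass on "dog"
loss_distant = hierarchy_aware_ce([0.0, 0.0, 0.0, 3.0], "cat")  # mass on "truck"
```

Under these assumptions, both predictions assign the same probability to "cat", so standard cross-entropy would score them identically. The sketch instead gives the sibling confusion a lower loss than the cross-branch confusion, because the "animal" node still accumulates most of the mass in the first case.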
