Beyond Categorical Label Representations for Image Classification
Boyuan Chen, Yu Li, Sunand Raghupathi, Hod Lipson

TL;DR
This paper demonstrates that high-dimensional, high-entropy label representations can improve model robustness and data efficiency in image classification, challenging traditional categorical label approaches.
Contribution
The paper proposes that complex, high-entropy label representations can improve model robustness and data efficiency, and supports the claim with experiments across several label types, including spectrograms, Gaussian mixtures, and uniform random matrices.
Findings
High-dimensional, high-entropy labels achieve accuracy comparable to categorical labels.
The features learned with these labels are more robust to adversarial attacks.
Models trained with these labels perform better when training data is limited.
Abstract
We find that the way we choose to represent data labels can have a profound effect on the quality of trained models. For example, training an image classifier to regress audio labels rather than traditional categorical probabilities produces a more reliable classification. This result is surprising, considering that audio labels are more complex than simpler numerical probabilities or text. We hypothesize that high dimensional, high entropy label representations are generally more useful because they provide a stronger error signal. We support this hypothesis with evidence from various label representations including constant matrices, spectrograms, shuffled spectrograms, Gaussian mixtures, and uniform random matrices of various dimensionalities. Our experiments reveal that high dimensional, high entropy labels achieve comparable accuracy to text (categorical) labels on the standard…
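As a rough illustration of the setup the abstract describes, the sketch below assigns each class a fixed uniform random matrix as its label, scores a prediction with a regression loss, and decodes a predicted matrix back to a class by finding the nearest label. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 64×64 label size, the mean-squared-error loss, and the nearest-label decoding rule are all illustrative choices that the text above does not specify.

```python
import numpy as np


def make_random_labels(num_classes, shape=(64, 64), seed=0):
    """Draw one fixed uniform random matrix per class (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(0.0, 1.0, size=(num_classes, *shape))


def regression_loss(pred, target):
    """Mean squared error between a predicted matrix and its target label.

    Assumption: the loss actually used in the paper may differ.
    """
    return float(np.mean((pred - target) ** 2))


def decode(pred, labels):
    """Return the index of the class whose label matrix is closest to pred.

    Assumption: squared Euclidean distance is used as the decoding rule.
    """
    diffs = labels - pred[None, ...]
    dists = np.sum(diffs.reshape(len(labels), -1) ** 2, axis=1)
    return int(np.argmin(dists))


# Ten classes, each represented by a 64x64 uniform random matrix.
labels = make_random_labels(num_classes=10)

# Stand-in for a network output: the class-3 label corrupted by Gaussian noise.
rng = np.random.default_rng(1)
noisy = labels[3] + rng.normal(0.0, 0.1, size=labels[3].shape)
predicted_class = decode(noisy, labels)
```

In a real training loop, the network would output a matrix of the same shape as the label, and `regression_loss` would be replaced by a differentiable loss from the chosen deep-learning framework.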
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Text and Document Classification Technologies
