The Tree Loss: Improving Generalization with Many Classes

Yujie Wang; Mike Izbicki

arXiv:2204.07727 · cs.LG · April 19, 2022

TL;DR

The paper introduces the tree loss, a new loss function for multi-class classification that forces semantically similar classes to have similar parameter vectors, which improves generalization over the standard cross entropy loss.

Contribution

The paper proposes the tree loss as a drop-in replacement for cross entropy, ensuring similar classes have similar parameters and demonstrating its theoretical and empirical advantages.

Findings

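01. The tree loss guarantees that semantically similar classes have similar parameter vectors.

02. Theoretical analysis shows that the tree loss's generalization error is asymptotically better than that of the cross entropy loss.

03. Experiments on synthetic data, image data (CIFAR100, ImageNet), and text data (Twitter) validate the theoretical results.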

Abstract

Multi-class classification problems often have many semantically similar classes. For example, 90 of ImageNet's 1000 classes are for different breeds of dog. We should expect that these semantically similar classes will have similar parameter vectors, but the standard cross entropy loss does not enforce this constraint. We introduce the tree loss as a drop-in replacement for the cross entropy loss. The tree loss re-parameterizes the parameter matrix in order to guarantee that semantically similar classes will have similar parameter vectors. Using simple properties of stochastic gradient descent, we show that the tree loss's generalization error is asymptotically better than the cross entropy loss's. We then validate these theoretical results on synthetic data, image data (CIFAR100, ImageNet), and text data (Twitter).
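
The re-parameterization idea can be illustrated with a minimal sketch. The abstract does not spell out the construction, so the code below assumes one plausible form: each class vector is the sum of per-node vectors along the path from the class's leaf to the root of a class hierarchy, so classes that share ancestors share most of their parameters. The toy hierarchy, the names `PARENT`, `CLASSES`, `class_weights`, and `cross_entropy`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy hierarchy: child -> parent (None marks the root).
PARENT = {
    "root": None,
    "dog": "root",
    "vehicle": "root",
    "poodle": "dog",
    "beagle": "dog",
    "truck": "vehicle",
}
CLASSES = ["poodle", "beagle", "truck"]  # the leaves are the actual classes


def path_to_root(node):
    """Return the nodes from `node` up to and including the root."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def class_weights(node_params):
    """Assumed construction: class vector = sum of node vectors on its path."""
    dim = len(next(iter(node_params.values())))
    weights = {}
    for c in CLASSES:
        w = [0.0] * dim
        for node in path_to_root(c):
            w = [a + b for a, b in zip(w, node_params[node])]
        weights[c] = w
    return weights


def cross_entropy(weights, x, label):
    """Ordinary softmax cross entropy, applied to the summed class vectors."""
    logits = [sum(wi * xi for wi, xi in zip(weights[c], x)) for c in CLASSES]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[CLASSES.index(label)]


# Illustrative node parameters (one 2-d vector per tree node).
node_params = {
    "root": [0.0, 0.0],
    "dog": [1.0, 0.0],
    "vehicle": [0.0, 1.0],
    "poodle": [0.1, 0.0],
    "beagle": [-0.1, 0.0],
    "truck": [0.0, 0.2],
}
weights = class_weights(node_params)
loss = cross_entropy(weights, x=[1.0, 0.0], label="poodle")
```

In this sketch the two dog breeds differ only by their small leaf vectors, so their class vectors stay close to each other, while the loss itself is still the usual cross entropy. This is what would make such a scheme a drop-in replacement: only the way the class vectors are built changes, not the training loop.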

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · AI in cancer detection