Massively Scaling Heteroscedastic Classifiers
Mark Collier, Rodolphe Jenatton, Basil Mustafa, Neil Houlsby, Jesse Berent, Effrosyni Kokiopoulou

TL;DR
HET-XL is a scalable heteroscedastic classifier whose additional parameter count does not grow with the number of classes and whose temperature is learned during training rather than tuned by hand, enabling large-scale image classification on datasets with up to 4B images and 30k classes.
Contribution
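We introduce HET-XL, a heteroscedastic classifier whose additional parameter count, relative to a standard classifier, is independent of the number of classes, and whose temperature is learned directly on the training data, improving scalability and performance on large datasets.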
Findings
Requires 14X fewer additional parameters than the baseline heteroscedastic classifier
Outperforms the baseline without requiring the temperature to be tuned
Effective on datasets with up to 4 billion images
Abstract
Heteroscedastic classifiers, which learn a multivariate Gaussian distribution over prediction logits, have been shown to perform well on image classification problems with hundreds to thousands of classes. However, compared to standard classifiers, they introduce extra parameters that scale linearly with the number of classes. This makes them infeasible to apply to larger-scale problems. In addition, heteroscedastic classifiers introduce a critical temperature hyperparameter which must be tuned. We propose HET-XL, a heteroscedastic classifier whose parameter count, when compared to a standard classifier, scales independently of the number of classes. In our large-scale settings, we show that we can remove the need to tune the temperature hyperparameter by directly learning it on the training data. On large image classification datasets with up to 4B images and 30k classes our method…
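The mechanism the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the Gaussian over logits is obtained by adding low-rank-plus-diagonal Gaussian noise to the pre-logit features and then applying the usual shared weight matrix, and that the prediction is a Monte Carlo average of temperature-scaled softmax outputs. The names `het_predict`, `V`, `d`, `extra_params_logit_noise`, and `extra_params_prelogit_noise` are hypothetical. In a real model the noise parameters would be produced per input by the network and the temperature would be a trained parameter. The two parameter-count functions assume a naive dense parameterization, chosen only to show linear versus constant scaling in the number of classes.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def het_predict(pre_logits, W, V, d, temperature, num_samples, rng):
    """Monte Carlo estimate of E[softmax(W (phi + eps) / tau)].

    eps ~ N(0, V V^T + diag(d)^2) is sampled in the pre-logit space
    (assumption), so the induced logits are Gaussian with covariance
    W (V V^T + diag(d)^2) W^T.
    pre_logits: D floats, W: K x D, V: D x R, d: D floats.
    """
    D, R, K = len(pre_logits), len(V[0]), len(W)
    probs = [0.0] * K
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(R)]  # low-rank component
        e = [rng.gauss(0.0, 1.0) for _ in range(D)]  # diagonal component
        noisy = [
            pre_logits[i] + sum(V[i][r] * z[r] for r in range(R)) + d[i] * e[i]
            for i in range(D)
        ]
        # Shared weights map noisy features to logits; tau rescales them.
        logits = [
            sum(w * h for w, h in zip(row, noisy)) / temperature for row in W
        ]
        for k, p in enumerate(softmax(logits)):
            probs[k] += p / num_samples
    return probs


def extra_params_logit_noise(D, K, R):
    """Illustrative count: dense layer from D features to a K x R factor
    plus K diagonal terms. Grows linearly with the number of classes K."""
    return D * (K * R + K)


def extra_params_prelogit_noise(D, R):
    """Illustrative count: dense layer from D features to a D x R factor
    plus D diagonal terms. Does not depend on the number of classes."""
    return D * (D * R + D)


# Toy usage with made-up numbers: D = 2 features, K = 3 classes, rank R = 1.
rng = random.Random(0)
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
probs = het_predict([1.0, -1.0], W, V=[[0.3], [0.1]], d=[0.2, 0.2],
                    temperature=0.5, num_samples=200, rng=rng)
print(probs)
print(extra_params_logit_noise(D=8, K=1000, R=2),
      extra_params_prelogit_noise(D=8, R=2))
```

In this sketch, doubling the number of classes doubles the first count and leaves the second unchanged, which is the scaling behaviour the abstract attributes to standard heteroscedastic classifiers and to HET-XL respectively.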
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Digital Imaging for Blood Diseases
Methods: Contrastive Learning
