How many faces can be recognized? Performance extrapolation for multi-class classification
Charles Y. Zheng, Rakesh Achanta, and Yuval Benjamini

TL;DR
This paper develops a theoretical framework for predicting how a multi-class classifier's accuracy scales with the number of classes, using data from a subset of the classes and assuming that classes are sampled exchangeably and that the classifier is generative.
Contribution
It introduces a method to extrapolate classifier performance to larger class sets using moments of a conditional accuracy distribution under exchangeability and generative assumptions.
Findings
Theoretical foundation for performance extrapolation in multi-class classification.
Method applicable to generative classifiers like QDA and Naive Bayes.
Robustness to non-generative classifiers investigated in simulations and one optical character recognition (OCR) example.
Abstract
The difficulty of multi-class classification generally increases with the number of classes. Using data from a subset of the classes, can we predict how well a classifier will scale with an increased number of classes? Under the assumption that the classes are sampled exchangeably, and under the assumption that the classifier is generative (e.g. QDA or Naive Bayes), we show that the expected accuracy when the classifier is trained on k classes is the (k - 1)st moment of a conditional accuracy distribution, which can be estimated from data. This provides the theoretical foundation for performance extrapolation based on pseudolikelihood, unbiased estimation, and high-dimensional asymptotics. We investigate the robustness of our methods to non-generative classifiers in simulations and one optical character recognition example.
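The moment relation stated in the abstract can be illustrated with a minimal sketch. It assumes that draws from the conditional accuracy distribution are already available as numbers in [0, 1]; how the paper estimates that distribution from data is not shown here. The function name `extrapolated_accuracy` and the input `u_grid` are hypothetical and chosen only for illustration.

```python
def extrapolated_accuracy(u_samples, k):
    """Estimate expected k-class accuracy as the (k - 1)st sample moment.

    u_samples: assumed draws from the conditional accuracy distribution,
               each a number in [0, 1] (hypothetical input for this sketch).
    k:         number of classes to extrapolate to (k >= 2).
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    if len(u_samples) == 0:
        raise ValueError("u_samples must not be empty")
    if any(u < 0.0 or u > 1.0 for u in u_samples):
        raise ValueError("conditional accuracies must lie in [0, 1]")
    # Sample analogue of E[U^(k-1)]
    return sum(u ** (k - 1) for u in u_samples) / len(u_samples)


# Stand-in data (an assumption, not from the paper): an evenly spaced grid
# that mimics a Uniform(0, 1) conditional accuracy distribution.
n = 10_000
u_grid = [(i + 0.5) / n for i in range(n)]

# Extrapolated accuracy for several class counts
curve = {k: extrapolated_accuracy(u_grid, k) for k in (2, 5, 10, 100)}
```

For a Uniform(0, 1) distribution the (k - 1)st moment is 1/k, which matches the accuracy of random guessing among k classes, so the stand-in data gives a quick sanity check. A distribution concentrated closer to 1 would produce a curve that decays more slowly as k grows.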
Taxonomy
Topics: Face and Expression Recognition · Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques
