Active Assessment of Prediction Services as Accuracy Surface Over Attribute Combinations
Vihari Piratla, Soumen Chakrabarti, Sunita Sarawagi

TL;DR
This paper introduces AAA, a Gaussian Process-based method to evaluate the accuracy surface of black-box classifiers across diverse attribute combinations, addressing challenges of heteroscedastic uncertainty and data sparsity.
Contribution
The paper proposes novel enhancements to Gaussian Process modeling for accuracy surfaces, including pooling observations and regularizing Beta scale parameters, improving estimation and exploration.
Findings
AAA accurately estimates accuracy surfaces across attribute combinations.
The enhanced GP approach effectively manages heteroscedastic uncertainty.
The method demonstrates superior exploration efficiency in the paper's experiments.
Abstract
Our goal is to evaluate the accuracy of a black-box classification model, not as a single aggregate on a given test data distribution, but as a surface over a large number of combinations of attributes characterizing multiple test data distributions. Such attributed accuracy measures become important as machine learning models get deployed as a service, where the training data distribution is hidden from clients, and different clients may be interested in diverse regions of the data distribution. We present Attributed Accuracy Assay (AAA), a Gaussian Process (GP)-based probabilistic estimator for such an accuracy surface. Each attribute combination, called an 'arm', is associated with a Beta density from which the service's accuracy is sampled. We expect the GP to smooth the parameters of the Beta density over related arms to mitigate sparsity. We show that obvious application of GPs…
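To make the notion of an 'arm' and its Beta density concrete, here is a minimal sketch. It is not the paper's GP estimator: it fits an independent Beta posterior to each arm from counts of correct predictions, which is the unsmoothed setting whose sparsity the GP is meant to mitigate. The attribute names, the observation counts, the uniform Beta(1, 1) prior, and the helper names `enumerate_arms` and `beta_posterior` are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical attributes and values, chosen only for illustration.
ATTRIBUTES = {
    "lighting": ["day", "night"],
    "weather": ["clear", "rain", "fog"],
}


def enumerate_arms(attributes):
    """Return every attribute combination ('arm') as a tuple of values."""
    return list(product(*attributes.values()))


def beta_posterior(correct, total, prior_a=1.0, prior_b=1.0):
    """Beta posterior over an arm's accuracy.

    Assumes a Beta(prior_a, prior_b) prior and `correct` correct
    predictions out of `total` labeled examples for that arm.
    Returns (a, b, mean, variance).
    """
    a = prior_a + correct
    b = prior_b + (total - correct)
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return a, b, mean, variance


# Hypothetical (correct, total) counts; arms not listed are unobserved.
observations = {
    ("day", "clear"): (90, 100),
    ("night", "fog"): (1, 2),
}

arms = enumerate_arms(ATTRIBUTES)
summary = {}
for arm in arms:
    correct, total = observations.get(arm, (0, 0))
    summary[arm] = beta_posterior(correct, total)

for arm, (a, b, mean, var) in summary.items():
    print(arm, f"Beta({a:g}, {b:g}) mean={mean:.3f} var={var:.4f}")
```

In this sketch the posterior variance differs widely between a well-observed arm and an unobserved one, which is one simple way to see why uncertainty across arms is heteroscedastic and why sharing information across related arms would help.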
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Greedy Policy Search · Gaussian Process
