Algebraic Ground Truth Inference: Non-Parametric Estimation of Sample Errors by AI Algorithms
Andrés Corrada-Emmanuel, Edward Pantridge, Edward Zahrebelski, Aditya Chaganti, and Simeon Simeonov

TL;DR
This paper introduces an algebraic geometry-based non-parametric method for estimating classifier errors in real-world, privacy-constrained settings without ground truth, demonstrated on advertising and benchmark datasets.
Contribution
The paper formulates classifier error self-assessment as an exact polynomial system, characterizes the conditions under which that system has solutions, and shows its practical utility in real-world data analysis.
Findings
Error estimates accurate to better than 1% on data where ground truth is available for comparison
Method effective in privacy-constrained environments
Consistency of the approach verified on online advertising data
Abstract
Binary classification is widely used in ML production systems. Monitoring classifiers in a constrained event space is well known. However, real-world production systems often lack the ground truth these methods require. Privacy concerns may also require that the ground truth needed to evaluate the classifiers cannot be made available. In these autonomous settings, non-parametric estimators of performance are an attractive solution. They do not require theoretical models about how the classifiers made errors in any given sample. They just estimate how many errors there are in a sample of an industrial or robotic datastream. We construct one such non-parametric estimator of the sample errors for an ensemble of weak binary classifiers. Our approach uses algebraic geometry to reformulate the self-assessment problem for ensembles of binary classifiers as an exact polynomial system. The…
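To give a feel for what an exact polynomial system for this problem can look like, here is a minimal sketch. It is not taken from the paper: it assumes three binary classifiers whose errors are independent given the true label, and the names `prevalence`, `acc_a`, `acc_b`, and `pattern_frequencies` are hypothetical. Under that assumption, the frequency of each voting pattern is a polynomial in the unknown prevalence and per-label accuracies, so the eight observable frequencies give eight polynomial equations in seven unknowns.

```python
from itertools import product
from math import isclose


def pattern_frequencies(prevalence, acc_a, acc_b):
    """Voting-pattern frequencies, ASSUMING errors are independent given the label.

    prevalence: fraction of items whose true label is "a"
    acc_a[i]:   accuracy of classifier i on items labeled "a"
    acc_b[i]:   accuracy of classifier i on items labeled "b"
    """
    freqs = {}
    for votes in product("ab", repeat=len(acc_a)):
        term_a = prevalence          # contribution from true-"a" items
        term_b = 1.0 - prevalence    # contribution from true-"b" items
        for vote, alpha, beta in zip(votes, acc_a, acc_b):
            term_a *= alpha if vote == "a" else 1.0 - alpha
            term_b *= 1.0 - beta if vote == "a" else beta
        freqs[votes] = term_a + term_b
    return freqs


# Illustrative (made-up) parameter values.
prevalence = 0.3
acc_a = [0.8, 0.7, 0.9]
acc_b = [0.6, 0.75, 0.85]
freqs = pattern_frequencies(prevalence, acc_a, acc_b)

# Swapping the roles of the two labels yields the same observable frequencies,
# so a system of this form cannot have a unique solution without extra information.
mirrored = pattern_frequencies(
    1.0 - prevalence,
    [1.0 - beta for beta in acc_b],
    [1.0 - alpha for alpha in acc_a],
)
same = all(isclose(freqs[v], mirrored[v]) for v in freqs)
```

In this sketch, an estimator would go the other way: observe the eight frequencies from unlabeled data and solve the polynomial equations for the prevalence and accuracies. The label-swap check shows why solutions of such a system come in pairs.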
Taxonomy
Topics: Data Stream Mining Techniques · Mobile Crowdsensing and Crowdsourcing · Anomaly Detection Techniques and Applications
