Satisfying Real-world Goals with Dataset Constraints
Gabriel Goh, Andrew Cotter, Maya Gupta, Michael Friedlander

TL;DR
This paper introduces a method for training classifiers that simultaneously satisfy multiple real-world goals across different datasets by using dataset constraints and an efficient optimization algorithm.
Contribution
The paper combines dataset constraints with ramp penalties to handle multiple goals, and presents an efficient algorithm that approximately optimizes the resulting non-convex constrained problem.
Findings
Effective on benchmark datasets
Successful in real-world industry applications
Outperforms traditional single-goal training methods
Abstract
The goal of minimizing misclassification error on a training set is often just one of several real-world goals that might be defined on different datasets. For example, one may require a classifier to also make positive predictions at some specified rate for some subpopulation (fairness), or to achieve a specified empirical recall. Other real-world goals include reducing churn with respect to a previously deployed model, or stabilizing online training. In this paper we propose handling multiple goals on multiple datasets by training with dataset constraints, using the ramp penalty to accurately quantify costs, and present an efficient algorithm to approximately optimize the resulting non-convex constrained optimization problem. Experiments on both benchmark and real-world industry datasets demonstrate the effectiveness of our approach.
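To make the idea of a ramp-penalized dataset constraint concrete, here is a minimal sketch in Python. It is an illustration, not the authors' implementation: the ramp form `max(0, min(1, 1/2 + z))` is one common definition and is assumed here, and the function names (`ramp`, `positive_rate`, `ramp_positive_rate`, `constraint_violation`) are hypothetical. The sketch compares the exact positive prediction rate on a dataset with its ramp approximation, and measures how far the approximate rate falls short of a required minimum.

```python
def ramp(z):
    """Ramp function (assumed form): 0 for z <= -0.5, 1 for z >= 0.5, linear in between."""
    return max(0.0, min(1.0, 0.5 + z))


def positive_rate(scores):
    """Exact fraction of examples whose classifier score is positive (0/1 indicator)."""
    return sum(1 for s in scores if s > 0) / len(scores)


def ramp_positive_rate(scores):
    """Ramp approximation of the positive prediction rate on a dataset."""
    return sum(ramp(s) for s in scores) / len(scores)


def constraint_violation(scores, min_rate):
    """Amount by which the ramp-approximated positive rate falls short of min_rate.

    A value <= 0 means the (approximate) constraint is satisfied.
    """
    return min_rate - ramp_positive_rate(scores)


if __name__ == "__main__":
    # Hypothetical classifier scores on a subpopulation dataset.
    scores = [-2.0, -0.25, 0.25, 2.0]
    print(positive_rate(scores))
    print(ramp_positive_rate(scores))
    print(constraint_violation(scores, min_rate=0.6))
```

Unlike the hinge loss, the ramp is bounded in [0, 1], so a single example with a large score cannot dominate the estimated rate. That boundedness is also what makes the constrained problem non-convex.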
Taxonomy
Topics: Imbalanced Data Classification Techniques · Machine Learning and Data Classification · Data Stream Mining Techniques
