Consistent optimization of AMS by logistic loss minimization
Wojciech Kotłowski

TL;DR
This paper provides a theoretical justification for using logistic loss minimization as a consistent method to optimize the approximate median significance (AMS) in binary classification tasks, especially in high-energy physics applications.
Contribution
It proves that minimizing logistic loss leads to consistent optimization of AMS through a two-stage procedure involving threshold tuning on a validation set.
Findings
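The regret of the thresholded classifier, measured with respect to squared AMS, is upper-bounded in terms of the logistic-loss regret of the underlying real-valued function.
The result gives a theoretical justification for a two-stage approach popular among participants of the Higgs Boson Machine Learning Challenge.
The decision threshold is tuned on a separate validation sample by directly optimizing AMS.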
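To make the quantities in these findings concrete, the block below writes out one common definition of AMS and the two regrets being compared. This is a sketch under assumptions: the AMS formula is the one used for scoring in the Higgs Boson Machine Learning Challenge (with regularizer b_reg = 10), and the symbols s, b, h, h^*, f, f^* and L_log are notation introduced here, not taken from this summary. The exact form of the bound is not stated in the summary and is not reproduced.

```latex
% Assumed notation: s, b = weighted counts of true and false positives
% among the events selected by the classifier; b_reg = 10 in the challenge.
\mathrm{AMS}(s, b) =
  \sqrt{2\left((s + b + b_{\mathrm{reg}})
        \ln\!\left(1 + \frac{s}{b + b_{\mathrm{reg}}}\right) - s\right)}

% Regret of a thresholded classifier h with respect to squared AMS,
% where h^* maximizes AMS:
\mathrm{Reg}_{\mathrm{AMS}^2}(h) = \mathrm{AMS}^2(h^*) - \mathrm{AMS}^2(h)

% Regret of a real-valued function f with respect to logistic loss,
% where f^* minimizes the expected logistic loss L_log:
\mathrm{Reg}_{\log}(f) = L_{\log}(f) - L_{\log}(f^*)
```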
Abstract
In this paper, we theoretically justify an approach popular among participants of the Higgs Boson Machine Learning Challenge to optimize approximate median significance (AMS). The approach is based on the following two-stage procedure. First, a real-valued function is learned by minimizing a surrogate loss for binary classification, such as logistic loss, on the training sample. Then, a threshold is tuned on a separate validation sample by direct optimization of AMS. We show that the regret of the resulting (thresholded) classifier, measured with respect to the squared AMS, is upper-bounded by the regret of the underlying real-valued function measured with respect to the logistic loss. Hence, we prove that minimizing the logistic surrogate is a consistent method of optimizing AMS.
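The two-stage procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the synthetic data, the unit event weights, the plain gradient-descent logistic regression, and the helper names (`ams`, `fit_logistic`, `score`, `tune_threshold`) are all assumptions made for the example, and the AMS formula is the challenge's scoring definition with b_reg = 10.

```python
import numpy as np

def ams(s, b, b_reg=10.0):
    """AMS as scored in the Higgs challenge (assumed definition, b_reg = 10)."""
    inner = (s + b + b_reg) * np.log1p(s / (b + b_reg)) - s
    return np.sqrt(2.0 * np.maximum(inner, 0.0))  # clip tiny negative round-off

def fit_logistic(X, y, lr=0.1, n_iter=500):
    """Stage 1: minimize the average logistic loss by gradient descent."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))        # predicted P(y = 1 | x)
        w -= lr * (Xb.T @ (p - y)) / len(y)        # gradient of the logistic loss
    return w

def score(w, X):
    """Real-valued score f(x) of the learned linear model."""
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ w

def tune_threshold(scores, y, weights):
    """Stage 2: choose the threshold that maximizes AMS on validation data."""
    best_t, best_val = None, -np.inf
    for t in np.unique(scores):
        selected = scores >= t
        s = weights[selected & (y == 1)].sum()     # weighted true positives
        b = weights[selected & (y == 0)].sum()     # weighted false positives
        val = ams(s, b)
        if val > best_val:
            best_t, best_val = t, val
    return best_t, best_val

# Hypothetical synthetic data: two Gaussian classes, about 30% signal.
rng = np.random.default_rng(0)
n = 2000
y = (rng.random(n) < 0.3).astype(float)
X = rng.normal(size=(n, 2)) + 1.5 * y[:, None]
weights = np.ones(n)                               # unit event weights (assumption)

# Separate training and validation samples, as in the two-stage procedure.
X_tr, y_tr = X[: n // 2], y[: n // 2]
X_val, y_val, w_val = X[n // 2 :], y[n // 2 :], weights[n // 2 :]

w = fit_logistic(X_tr, y_tr)
best_t, best_ams = tune_threshold(score(w, X_val), y_val, w_val)
print(f"tuned threshold = {best_t:.3f}, validation AMS = {best_ams:.3f}")
```

Because the lowest candidate threshold selects every validation event, the tuned AMS can never be worse on the validation sample than classifying everything as signal. Scanning all unique scores costs quadratic time in the validation size; sorting the scores once and using cumulative sums would be the natural refinement for larger samples.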
Taxonomy
Topics: Machine Learning and Algorithms · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research
