Optimal binning: mathematical programming formulation
Guillermo Navas-Palencia

TL;DR
This paper introduces a rigorous mathematical programming approach to optimal binning for binary, continuous, and multi-class targets. It adds constraints not previously addressed and several algorithmic enhancements, and is implemented in the open-source Python library OptBinning.
Contribution
The paper presents a novel convex mixed-integer programming formulation of optimal binning for each of the three target types (binary, continuous, multi-class), together with algorithmic improvements and an implementation in OptBinning.
Findings
Convex mixed-integer programming formulations cover binary, continuous, and multi-class targets.
Algorithmic enhancements include automatic selection of the most suitable monotonic trend via a machine-learning-based classifier.
The formulations are implemented in the open-source Python library OptBinning.
Abstract
Optimal binning is the optimal discretization of a variable into bins given a discrete or continuous numeric target. We present a rigorous and extensible mathematical programming formulation for solving the optimal binning problem for binary, continuous, and multi-class target types, incorporating constraints not previously addressed. For all three target types, we introduce a convex mixed-integer programming formulation. Several algorithmic enhancements, such as automatic determination of the most suitable monotonic trend via a machine-learning-based classifier, as well as implementation aspects, are thoughtfully discussed. The new mathematical programming formulations are carefully implemented in the open-source Python library OptBinning.
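To make the problem statement concrete, the following is a minimal sketch of optimal binning for a binary target on a toy scale. It is not the paper's mixed-integer formulation and does not use the OptBinning API: it simply enumerates every way of merging consecutive pre-bins and keeps the best one. The objective (Information Value, a common choice in credit scoring), the ascending event-rate constraint, and all function names and data are assumptions made for illustration.

```python
from itertools import combinations
from math import log


def merge(prebins, cuts):
    """Merge consecutive pre-bins; `cuts` holds the indices where a new bin starts."""
    edges = [0, *cuts, len(prebins)]
    merged = []
    for lo, hi in zip(edges, edges[1:]):
        events = sum(b[0] for b in prebins[lo:hi])
        nonevents = sum(b[1] for b in prebins[lo:hi])
        merged.append((events, nonevents))
    return merged


def information_value(bins, total_events, total_nonevents):
    """Information Value of a binning; each bin is an (events, nonevents) pair."""
    iv = 0.0
    for events, nonevents in bins:
        p = events / total_events          # share of all events in this bin
        q = nonevents / total_nonevents    # share of all non-events in this bin
        iv += (q - p) * log(q / p)
    return iv


def best_binning(prebins, max_bins, monotonic=True):
    """Exhaustive search (an illustrative stand-in for a MIP solver).

    Returns the cut indices and IV of the best merge of consecutive
    pre-bins into at most `max_bins` bins. With `monotonic=True`, the
    event rate must be strictly ascending across bins.
    """
    total_events = sum(b[0] for b in prebins)
    total_nonevents = sum(b[1] for b in prebins)
    best_iv, best_cuts = -1.0, None
    for k in range(max_bins):  # k cuts produce k + 1 bins
        for cuts in combinations(range(1, len(prebins)), k):
            bins = merge(prebins, cuts)
            # IV is undefined for bins with no events or no non-events.
            if any(e == 0 or ne == 0 for e, ne in bins):
                continue
            rates = [e / (e + ne) for e, ne in bins]
            if monotonic and any(a >= b for a, b in zip(rates, rates[1:])):
                continue
            iv = information_value(bins, total_events, total_nonevents)
            if iv > best_iv:
                best_iv, best_cuts = iv, cuts
    return best_cuts, best_iv


if __name__ == "__main__":
    # Hypothetical pre-bins as (events, nonevents), ordered by the variable's value.
    toy_prebins = [(5, 95), (8, 92), (7, 93), (20, 80), (40, 60)]
    print(best_binning(toy_prebins, max_bins=4))
```

Enumeration grows combinatorially with the number of pre-bins, which is why a mathematical programming formulation solved by a dedicated solver is the practical route for realistic problem sizes.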
Taxonomy
Topics: Machine Learning and Data Classification · Algorithms and Data Compression · Machine Learning and Algorithms
