Bounded Guaranteed Algorithms for Concave Impurity Minimization Via Maximum Likelihood
Thuan Nguyen, Hoang Le, Thinh Nguyen

TL;DR
This paper introduces a low-complexity partitioning algorithm with bounded guarantees for impurity minimization, based on maximum likelihood, achieving state-of-the-art approximation bounds for Gini index and entropy.
Contribution
It constructs bounds for impurity functions and proposes a new algorithm with guaranteed performance, extending theoretical bounds like Fano's inequality.
Findings
Achieves polynomial time complexity $O(NM)$ for $K \geq N$
Provides bounded guarantees for Gini index and entropy impurity functions
Greedy-merge heuristic performs comparably without guarantees
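The $O(NM)$ figure is what a single pass over the data would cost. The sketch below illustrates a maximum-likelihood style assignment under stated assumptions: each element is taken to be a vector of joint probabilities with the class labels, and each element is sent to the partition indexed by its most likely class, which uses at most one partition per class. The function name `ml_partition` and the data layout are hypothetical and chosen for illustration; this is not the authors' exact procedure.

```python
def ml_partition(joint):
    """Assign each element to the index of its largest coordinate.

    joint: list of rows, one per element; joint[i][j] is assumed to be
    the joint probability of element i and class j (hypothetical layout).
    Returns one partition index per element.
    """
    labels = []
    for row in joint:
        # argmax over the coordinates of this element;
        # cost is proportional to the row length, so the whole
        # pass costs (number of elements) x (vector dimension)
        best = max(range(len(row)), key=lambda j: row[j])
        labels.append(best)
    return labels


if __name__ == "__main__":
    # Three elements, two classes (made-up numbers for illustration).
    example = [[0.10, 0.20], [0.30, 0.05], [0.05, 0.30]]
    print(ml_partition(example))
```

Because every element is handled independently, no search over candidate partitions is needed, which is where the saving over exhaustive search comes from.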
Abstract
Partitioning algorithms play a key role in many scientific and engineering disciplines. A partitioning algorithm divides a set into a number of disjoint subsets or partitions. Often, the quality of the resulting partitions is measured by the amount of impurity in each partition: the smaller the impurity, the higher the quality of the partitions. In general, for a given impurity measure specified by a function of the partitions, finding the minimum impurity partitions is an NP-hard problem. Let $M$ be the number of $N$-dimensional elements in a set and $K$ be the number of desired partitions; then an exhaustive search over all the possible partitions to find a minimum partition has a complexity of $O(K^M)$, which quickly becomes impractical for many applications with modest values of $K$ and $M$. Thus, many approximate algorithms with polynomial time complexity have been proposed, but few provide…
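To make "impurity of a partition" concrete, the sketch below scores a given assignment with the two impurity functions named in the summary, the Gini index and entropy. It assumes the common weighted form: each partition's impurity is computed on its normalized class distribution and then weighted by the partition's total probability mass. The function name `partition_impurity` and the input layout (one vector of joint probabilities per element) are hypothetical; the paper's exact definitions may differ.

```python
import math


def partition_impurity(joint, labels, kind="gini"):
    """Weighted impurity of a partition (illustrative definition).

    joint:  list of rows, joint[i][j] assumed to be p(element i, class j)
    labels: partition index of each element
    kind:   "gini" or "entropy" (entropy measured in bits)
    """
    num_parts = max(labels) + 1
    num_classes = len(joint[0])

    # Sum the vectors of all elements that fall in the same partition.
    sums = [[0.0] * num_classes for _ in range(num_parts)]
    for row, lab in zip(joint, labels):
        for j, value in enumerate(row):
            sums[lab][j] += value

    total = 0.0
    for s in sums:
        weight = sum(s)
        if weight == 0:
            continue  # an empty partition contributes nothing
        probs = [v / weight for v in s]
        if kind == "gini":
            impurity = 1.0 - sum(p * p for p in probs)
        elif kind == "entropy":
            impurity = -sum(p * math.log2(p) for p in probs if p > 0)
        else:
            raise ValueError("kind must be 'gini' or 'entropy'")
        total += weight * impurity
    return total


if __name__ == "__main__":
    # Four elements, two classes (made-up numbers for illustration).
    data = [[0.25, 0.0], [0.0, 0.25], [0.25, 0.0], [0.0, 0.25]]
    print(partition_impurity(data, [0, 1, 0, 1], "gini"))
    print(partition_impurity(data, [0, 0, 1, 1], "entropy"))
```

A partition that groups elements of the same class scores zero, while one that mixes classes evenly scores the maximum for two classes. An exhaustive search would have to evaluate such a score for every possible assignment of elements to partitions.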
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms
