Flexibly Mining Better Subgroups
Hoang-Vu Nguyen, Jilles Vreeken

TL;DR
FLEXI introduces an optimal binning approach for subgroup discovery that significantly improves subgroup quality for numeric and ordinal attributes, outperforming existing methods.
Contribution
The paper presents FLEXI, a novel method that uses optimal binning tailored for subgroup discovery, enhancing quality and efficiency over prior binning strategies.
Findings
FLEXI outperforms state-of-the-art methods with up to 25 times better subgroup quality.
Experiments on synthetic and real-world data validate FLEXI's effectiveness.
FLEXI can be instantiated with various quality measures and is computationally efficient.
Abstract
In subgroup discovery, also known as supervised pattern mining, discovering high quality one-dimensional subgroups and refinements of these is a crucial task. For nominal attributes, this is relatively straightforward, as we can consider individual attribute values as binary features. For numerical attributes, the task is more challenging as individual numeric values are not reliable statistics. Instead, we can consider combinations of adjacent values, i.e. bins. Existing binning strategies, however, are not tailored for subgroup discovery. That is, they do not directly optimize for the quality of subgroups, therewith potentially degrading the mining result. To address this issue, we propose FLEXI. In short, with FLEXI we propose to use optimal binning to find high quality binary features for both numeric and ordinal attributes. We instantiate FLEXI with various quality measures and…
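To make the idea of binning that directly optimizes subgroup quality concrete, here is a minimal sketch. It is not the FLEXI algorithm. It assumes a binary target and uses weighted relative accuracy (WRAcc) as the quality measure, then exhaustively searches every interval of adjacent attribute values for the one with the highest quality. The function names `wracc` and `best_interval` are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def wracc(n_sub, pos_sub, n, pos):
    """Weighted relative accuracy of a subgroup for a binary target.

    n_sub / pos_sub: size and positive count of the subgroup.
    n / pos: size and positive count of the whole dataset.
    """
    if n_sub == 0:
        return 0.0
    return (n_sub / n) * (pos_sub / n_sub - pos / n)


def best_interval(values, targets):
    """Return (quality, lo, hi) for the closed interval [lo, hi] of
    attribute values whose covered rows maximize WRAcc.

    Brute force over all pairs of distinct values; an illustrative
    assumption, not the search strategy used in the paper.
    """
    n = len(values)
    pos = sum(targets)

    # Aggregate rows per distinct attribute value, in sorted order.
    distinct = sorted(set(values))
    index = {v: i for i, v in enumerate(distinct)}
    counts = [0] * len(distinct)
    positives = [0] * len(distinct)
    for v, t in zip(values, targets):
        counts[index[v]] += 1
        positives[index[v]] += t

    # Prefix sums give the size and positive count of any interval in O(1).
    cum_n = [0] + list(accumulate(counts))
    cum_p = [0] + list(accumulate(positives))

    best = (float("-inf"), None, None)
    for i in range(len(distinct)):
        for j in range(i, len(distinct)):
            q = wracc(cum_n[j + 1] - cum_n[i], cum_p[j + 1] - cum_p[i], n, pos)
            if q > best[0]:
                best = (q, distinct[i], distinct[j])
    return best


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    y = [0, 0, 1, 1, 1, 0, 0, 0]
    print(best_interval(x, y))
```

In this toy data the positives sit at values 3 to 5, so the search selects that interval. An equal-width or equal-frequency discretization fixed in advance could split or dilute that range, which is the kind of quality loss the abstract attributes to binning strategies that are not tailored for subgroup discovery.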
