AdaBox: Adaptive Density-Based Box Clustering with Parameter Generalization
Ahmed Elmahdi

TL;DR
AdaBox is a robust, grid-based density clustering algorithm whose parameters transfer across diverse datasets, outperforming existing methods like DBSCAN and HDBSCAN in accuracy and transferability.
Contribution
Introduces AdaBox, a novel density-based clustering method whose parameters are designed for robustness and transferability across varying data scales and geometries.
Findings
AdaBox outperforms DBSCAN and HDBSCAN on 78% of datasets (p < 0.05).
AdaBox maintains performance when scaling datasets 30-100x.
All five architectural stages are essential for robustness.
Abstract
Density-based clustering algorithms like DBSCAN and HDBSCAN are foundational tools for discovering arbitrarily shaped clusters, yet their practical utility is undermined by acute hyperparameter sensitivity -- parameters tuned on one dataset frequently fail to transfer to others, requiring expensive re-optimization for each deployment. We introduce AdaBox (Adaptive Density-Based Box Clustering), a grid-based density clustering algorithm designed for robustness across diverse data geometries. AdaBox features a six-parameter design where parameters capture cluster structure rather than pairwise point relationships. Four parameters are inherently scale-invariant, one self-corrects for sampling bias, and one is adjusted via a density scaling stage, enabling reliable parameter transfer across 30-200x scale factors. AdaBox processes data through five stages: adaptive grid construction,…
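For readers unfamiliar with the term, the following is a minimal Python sketch of generic grid-based density clustering: points are binned into cells, cells holding enough points are marked dense, and adjacent dense cells are merged into clusters. It is not AdaBox's five-stage pipeline, which the abstract above does not fully describe. The function name `grid_density_cluster` and the parameters `cells_per_axis` and `min_count` are hypothetical choices made for this illustration only.

```python
from collections import deque
from itertools import product

import numpy as np


def grid_density_cluster(points, cells_per_axis=10, min_count=3):
    """Generic grid-based density clustering (illustrative; NOT AdaBox).

    cells_per_axis and min_count are hypothetical parameters for this sketch.
    Returns one integer label per point; -1 marks noise.
    """
    pts = np.asarray(points, dtype=float)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against flat axes

    # Cell indices are relative to the bounding box, not absolute distances.
    idx = ((pts - lo) / span * cells_per_axis).astype(int)
    idx = np.minimum(idx, cells_per_axis - 1)  # points on the upper edge

    # Group point indices by cell and keep cells that are dense enough.
    cells = {}
    for i, key in enumerate(map(tuple, idx.tolist())):
        cells.setdefault(key, []).append(i)
    dense = {k for k, members in cells.items() if len(members) >= min_count}

    # Flood-fill over dense cells that touch (including diagonals).
    offsets = list(product((-1, 0, 1), repeat=pts.shape[1]))
    cell_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cur, off))
                if nb in dense and nb not in cell_label:
                    cell_label[nb] = next_label
                    queue.append(nb)
        next_label += 1

    # Points in non-dense cells keep the noise label -1.
    labels = np.full(len(pts), -1)
    for key, lab in cell_label.items():
        labels[cells[key]] = lab
    return labels


# Usage: two well-separated synthetic blobs.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(5.0, 0.1, (50, 2))])
labels = grid_density_cluster(pts)
```

Because the cell index in this sketch is computed relative to the data's bounding box, uniformly rescaling the input leaves the labels unchanged, whereas a distance threshold such as DBSCAN's `eps` would need retuning. This only illustrates in general terms why a structural parameter can be scale-invariant; it makes no claim about how AdaBox achieves it.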
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Domain Adaptation and Few-Shot Learning
