A Data-Informed Variational Clustering Framework for Noisy High-Dimensional Data
Wan Ping Chen

TL;DR
DIVI is a practical variational clustering framework designed for noisy high-dimensional data, combining feature relevance learning and adaptive structure growth to improve stability and interpretability.
Contribution
It introduces a data-informed variational approach with feature gating and adaptive structure expansion, addressing the challenges of feature noise and an unknown number of clusters.
Findings
Performs well under severe feature noise
Maintains computational feasibility
Provides interpretable feature relevance behavior
Abstract
Clustering in high-dimensional settings with severe feature noise remains challenging, especially when only a small subset of dimensions is informative and the final number of clusters is not specified in advance. In such regimes, partition recovery, feature relevance learning, and structural adaptation are tightly coupled, and standard likelihood-based methods can become unstable or overly sensitive to noisy dimensions. We propose DIVI, a data-informed variational clustering framework that combines global feature gating with split-based adaptive structure growth. DIVI uses informative prior initialization to stabilize optimization, learns feature relevance in a differentiable manner, and expands model complexity only when local diagnostics indicate underfit. Beyond clustering performance, we also examine runtime scalability and parameter sensitivity in order to clarify the…
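As a rough illustration of the global feature gating idea mentioned in the abstract, the sketch below computes soft cluster assignments in which every feature's contribution is scaled by a shared gate in (0, 1). It is not DIVI's implementation. The unit-variance Gaussian components, the hand-set gate logits, and the names gated_responsibilities and gate_logits are all assumptions made for this example. In the framework the gates are learned in a differentiable manner, whereas here they are fixed so that only the first feature is treated as informative. Split-based structure growth is not covered.

```python
import numpy as np

def gated_responsibilities(X, means, gate_logits):
    """Soft assignments with one global gate per feature (illustrative assumption).

    X: (n, d) data, means: (k, d) cluster centers, gate_logits: (d,) real values.
    Components are assumed to be unit-variance Gaussians with equal weights.
    """
    gates = 1.0 / (1.0 + np.exp(-gate_logits))            # sigmoid, shape (d,)
    diff = X[:, None, :] - means[None, :, :]               # shape (n, k, d)
    log_score = -0.5 * np.sum(gates * diff ** 2, axis=2)   # gated squared distance
    log_score -= log_score.max(axis=1, keepdims=True)      # numerical stability
    resp = np.exp(log_score)
    return resp / resp.sum(axis=1, keepdims=True), gates

# Toy data: one informative feature, nineteen pure-noise features.
rng = np.random.default_rng(0)
n, d = 40, 20
labels = np.repeat([0, 1], n // 2)
X = rng.normal(scale=5.0, size=(n, d))
X[:, 0] = np.where(labels == 0, -3.0, 3.0) + 0.1 * rng.normal(size=n)

# Centers are correct on feature 0 and arbitrary on the noise features.
means = rng.normal(scale=2.0, size=(2, d))
means[:, 0] = [-3.0, 3.0]

# Hand-set gates: open on feature 0, nearly closed elsewhere.
gate_logits = np.full(d, -10.0)
gate_logits[0] = 10.0

resp, gates = gated_responsibilities(X, means, gate_logits)
```

With the gates nearly closed on the noise dimensions, the assignment is driven almost entirely by the informative feature, which is the behavior a gating mechanism aims for when only a small subset of dimensions carries cluster structure.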
