MC-GTA: Metric-Constrained Model-Based Clustering using Goodness-of-fit Tests with Autocorrelations
Zhangyu Wang, Gengchen Mai, Krzysztof Janowicz, Ni Lao

TL;DR
This paper introduces MC-GTA, a novel clustering algorithm that effectively incorporates metric autocorrelation using goodness-of-fit tests, leading to improved accuracy and stability in clustering multivariate temporal and spatial data.
Contribution
The paper presents MC-GTA, a new model-based clustering method that addresses computational instability by integrating metric autocorrelation through goodness-of-fit tests.
Findings
Outperforms baseline methods with up to 14.3% higher ARI.
Achieves up to 32.1% higher NMI than the baseline methods.
Provides faster and more stable optimization (>10x speedup).
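For readers unfamiliar with the first metric above, the Adjusted Rand Index (ARI) can be computed from the contingency table of two labelings. The sketch below uses the standard pair-counting definition and only the Python standard library; the function name `adjusted_rand_index` is illustrative and is not taken from the paper or its code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two flat clusterings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two items: the partitions agree trivially.
        return 1.0

    # Contingency counts n_ij, plus row sums a_i and column sums b_j.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    # Expected index under random labelings with the same cluster sizes.
    expected = sum_rows * sum_cols / total_pairs
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        # Degenerate case, e.g. both labelings put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Same partition with swapped label names: perfect agreement.
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

ARI is invariant to how clusters are named, which is why the swapped labels in the usage line still count as a perfect match.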
Abstract
A wide range of (multivariate) temporal (1D) and spatial (2D) data analysis tasks, such as grouping vehicle sensor trajectories, can be formulated as clustering with given metric constraints. Existing metric-constrained clustering algorithms overlook the rich correlation between feature similarity and metric distance, i.e., metric autocorrelation. The model-based variations of these clustering algorithms (e.g., TICC and STICC) achieve SOTA performance, yet suffer from computational instability and complexity because they use a metric-constrained Expectation-Maximization procedure. To address these two problems, we propose a novel clustering algorithm, MC-GTA (Model-based Clustering via Goodness-of-fit Tests with Autocorrelations). Its objective is composed only of pairwise weighted sums of feature similarity terms (squared Wasserstein-2 distance) and metric autocorrelation terms (a novel…
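To make the feature similarity term concrete, the sketch below evaluates the squared Wasserstein-2 distance in one simple special case. It assumes the fitted models are Gaussians with diagonal covariances, which this excerpt does not state; under that assumption the distance has the closed form ||m_a - m_b||^2 + sum_i (sqrt(v_a,i) - sqrt(v_b,i))^2. The function name and argument layout are hypothetical, and full covariance matrices would need a matrix square root instead.

```python
import math


def squared_w2_diag_gaussians(mean_a, var_a, mean_b, var_b):
    """Squared Wasserstein-2 distance between two diagonal-covariance Gaussians.

    mean_* are sequences of per-dimension means; var_* are the matching
    per-dimension variances (the diagonal of each covariance matrix).
    """
    if not (len(mean_a) == len(var_a) == len(mean_b) == len(var_b)):
        raise ValueError("means and variances must share one dimension")
    if any(v < 0 for v in list(var_a) + list(var_b)):
        raise ValueError("variances must be non-negative")

    # Squared Euclidean distance between the means.
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))
    # For diagonal covariances the covariance term reduces to the squared
    # difference of per-dimension standard deviations.
    cov_term = sum(
        (math.sqrt(va) - math.sqrt(vb)) ** 2 for va, vb in zip(var_a, var_b)
    )
    return mean_term + cov_term


if __name__ == "__main__":
    print(squared_w2_diag_gaussians([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [4.0, 9.0]))
```

Because this form needs only means and variances, each pairwise term can be evaluated once from the fitted models, without iterating over the raw observations.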
Taxonomy
Topics: Advanced Clustering Algorithms Research
