Fast-VAT: Accelerating Cluster Tendency Visualization using Cython and Numba
MSR Avinash (Presidency University, Bangalore), Ismael Lachheb (EPITA School of Engineering, Computer Science, Paris, France)

TL;DR
Fast-VAT accelerates the Visual Assessment of Cluster Tendency (VAT) method in Python using Cython and Numba, achieving up to a 50x speedup over the baseline implementation while preserving its output. It is validated on multiple real and synthetic datasets, and its results are compared with those of clustering algorithms.
Contribution
The paper introduces Fast-VAT, a high-performance Python implementation of VAT that combines Numba's Just-In-Time (JIT) compilation with Cython's static typing to obtain substantial speed improvements.
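To illustrate the kind of JIT compilation described here, the following is a minimal sketch of a Numba-compiled pairwise Euclidean distance routine, which is the O(n^2) step that produces VAT's input matrix. It is not taken from the paper: the function name `pairwise_euclidean` is hypothetical, and the fallback decorator is an assumption added so that the example still runs, uncompiled, when Numba is not installed.

```python
import numpy as np

try:
    # Numba compiles the decorated function to machine code on first call.
    from numba import njit
except ImportError:
    # Assumption for portability: without Numba, run as plain Python.
    def njit(func):
        return func


@njit
def pairwise_euclidean(X):
    """Return the n x n Euclidean distance matrix for the rows of X.

    Hypothetical helper, not the paper's code. X is expected to be a
    2-D float64 array of shape (n_samples, n_features).
    """
    n, d = X.shape
    D = np.zeros((n, n))
    for i in range(n):
        # Only the upper triangle is computed; the matrix is symmetric.
        for j in range(i + 1, n):
            s = 0.0
            for k in range(d):
                diff = X[i, k] - X[j, k]
                s += diff * diff
            D[i, j] = np.sqrt(s)
            D[j, i] = D[i, j]
    return D
```

Explicit loops such as these are slow in interpreted Python but are the style that Numba compiles well, which is one plausible reason a loop-heavy algorithm like VAT benefits from JIT compilation.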
Findings
Achieves up to a 50x speedup over the baseline VAT implementation.
Preserves the output fidelity of the original method while improving performance.
Validated on real and synthetic datasets, with consistent results.
Abstract
Visual Assessment of Cluster Tendency (VAT) is a widely used unsupervised technique to assess the presence of cluster structure in unlabeled datasets. However, its standard implementation suffers from significant performance limitations due to its O(n^2) time complexity and inefficient memory usage. In this work, we present Fast-VAT, a high-performance reimplementation of the VAT algorithm in Python, augmented with Numba's Just-In-Time (JIT) compilation and Cython's static typing and low-level memory optimizations. Our approach achieves up to 50x speedup over the baseline implementation, while preserving the output fidelity of the original method. We validate Fast-VAT on a suite of real and synthetic datasets -- including Iris, Mall Customers, and Spotify subsets -- and verify cluster tendency using Hopkins statistics, PCA, and t-SNE. Additionally, we compare VAT's structural insights…
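For readers unfamiliar with VAT, the reordering at its core can be sketched as follows. This is a plain NumPy illustration of the standard VAT ordering (a Prim-like traversal of the dissimilarity matrix), not the authors' Fast-VAT code; the function names `vat_order` and `vat` are hypothetical.

```python
import numpy as np


def vat_order(D):
    """Return the VAT ordering of objects for a symmetric dissimilarity matrix D.

    Illustrative sketch only; names are hypothetical, not from Fast-VAT.
    """
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity in D.
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    order = np.empty(n, dtype=np.int64)
    order[0] = i
    visited = np.zeros(n, dtype=bool)
    visited[i] = True
    # min_dist[q] = smallest dissimilarity from any selected object to q.
    min_dist = D[i].astype(float).copy()
    min_dist[visited] = np.inf
    for r in range(1, n):
        # Pick the unselected object closest to the selected set.
        j = int(np.argmin(min_dist))
        order[r] = j
        visited[j] = True
        min_dist = np.minimum(min_dist, D[j])
        min_dist[visited] = np.inf
    return order


def vat(D):
    """Return the reordered dissimilarity matrix and the ordering used."""
    order = vat_order(D)
    return D[np.ix_(order, order)], order


# Usage: two well-separated groups of points on a line.
points = np.array([0.0, 10.0, 0.1, 10.1, 0.2])
D = np.abs(points[:, None] - points[None, :])
reordered, order = vat(D)
```

Displaying `reordered` as a grayscale image (for example with matplotlib's `imshow`) would show dark blocks along the diagonal where objects are mutually close, which is how VAT suggests the presence of clusters. Maintaining `min_dist` incrementally keeps this ordering step at O(n^2) rather than rescanning all selected and unselected pairs at every iteration.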
Taxonomy
Topics: Data Visualization and Analytics
