Vector Quantized Spectral Clustering applied to Soybean Whole Genome Sequences
Aditya A. Shastri, Kapil Ahuja, Milind B. Ratnaparkhe, Aditya Shah, Aishwary Gagrani, and Anant Lal

TL;DR
This paper introduces a novel Vector Quantized Spectral Clustering method tailored for soybean genome data, achieving higher accuracy and faster computation compared to traditional clustering techniques.
Contribution
It combines spectral clustering with vector quantization, using a new similarity matrix and k-medoids, optimized specifically for soybean genome sequences.
Findings
Outperforms UPGMA and NJ in cluster quality by up to 25%
Runs about an order of magnitude faster than UPGMA and NJ
Demonstrates effectiveness on soybean genome data
Abstract
We develop a Vector Quantized Spectral Clustering (VQSC) algorithm that combines Spectral Clustering (SC) with Vector Quantization (VQ) sampling for grouping Soybean genomes. The motivation is to use SC for its accuracy and VQ to make the algorithm computationally cheap (the complexity of SC is cubic in terms of the input size). Although the combination of SC and VQ is not new, the novelty of our work lies in developing the crucial similarity matrix in SC and in using k-medoids in VQ, both adapted for the Soybean genome data. We compare our approach with commonly used techniques such as UPGMA (Unweighted Pair Group Method with Arithmetic Mean) and NJ (Neighbour Joining). Experimental results show that our approach significantly outperforms both techniques in terms of cluster quality (up to 25% better) and running time (an order of magnitude faster).
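The pipeline described in the abstract (quantize with k-medoids, run spectral clustering on the representatives only, then propagate labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`k_medoids`, `spectral_labels`, `vqsc_sketch`), the parameters `m` and `sigma`, and the assumption that each genome is already encoded as a numeric feature vector are all hypothetical. In particular, the Gaussian kernel on Euclidean distance is only a placeholder for the paper's genome-specific similarity matrix, which is not described here.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    return np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2))

def k_medoids(X, m, n_iter=20, seed=0):
    """VQ step: pick m representative rows of X by alternating k-medoids."""
    rng = np.random.default_rng(seed)
    D = pairwise_dist(X, X)
    medoids = rng.choice(len(X), size=m, replace=False)
    for _ in range(n_iter):
        assign = D[:, medoids].argmin(axis=1)
        updated = medoids.copy()
        for j in range(m):
            members = np.flatnonzero(assign == j)
            if members.size:
                # New medoid: the member with the smallest total distance to its group.
                within = D[np.ix_(members, members)].sum(axis=1)
                updated[j] = members[within.argmin()]
        if np.array_equal(updated, medoids):
            break
        medoids = updated
    return medoids, D[:, medoids].argmin(axis=1)

def kmeans(U, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [U[0]]
    for _ in range(k - 1):
        nearest = pairwise_dist(U, np.array(centers)).min(axis=1)
        centers.append(U[nearest.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = pairwise_dist(U, centers).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def spectral_labels(R, k, sigma=1.0):
    """SC step on the representatives R (normalized-Laplacian variant)."""
    # Placeholder similarity (assumption): Gaussian kernel on Euclidean distance.
    W = np.exp(-pairwise_dist(R, R) ** 2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(R)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the k smallest.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    return kmeans(U, k)

def vqsc_sketch(X, m, k, sigma=1.0, seed=0):
    """Quantize X to m medoids, cluster the medoids, propagate labels to all rows."""
    medoids, assign = k_medoids(X, m, seed=seed)
    rep_labels = spectral_labels(X[medoids], k, sigma=sigma)
    return rep_labels[assign]
```

In this sketch the eigendecomposition runs on an m × m matrix instead of an n × n one, so the cubic term scales with the number of representatives rather than the number of genomes. The k-medoids step here still builds a full n × n distance matrix, so the overall cost is dominated by that quadratic term when m is much smaller than n.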
Taxonomy
Topics: Soybean genetics and cultivation · Genetic Mapping and Diversity in Plants and Animals · GABA and Rice Research
Methods: Spectral Clustering
