Clustering large number of extragalactic spectra of galaxies and quasars through canopies
Tuli De, Didier Fraix-Burnet (IPAG), Asis Kumar Chattopadhyay

TL;DR
This paper introduces a new clustering method tailored for large, high-dimensional datasets, successfully applied to over 700,000 galaxy and quasar spectra to identify five distinct groups.
Contribution
A novel clustering technique designed for large, high-dimensional data sets, demonstrated on extensive astronomical spectral data.
Findings
Successfully clustered 702,248 spectra into five groups
Effective handling of high-dimensional, large-scale data
Demonstrated applicability to astronomical datasets
Abstract
Cluster analysis is the distribution of objects into different groups or more precisely the partitioning of a data set into subsets (clusters) so that the data in subsets share some common trait according to some distance measure. Unlike classification, in clustering one has to first decide the optimum number of clusters and then assign the objects into different clusters. Solution of such problems for a large number of high dimensional data points is quite complicated and most of the existing algorithms will not perform properly. In the present work a new clustering technique applicable to large data set has been used to cluster the spectra of 702248 galaxies and quasars having 1540 points in wavelength range imposed by the instrument. The proposed technique has successfully discovered five clusters from this 702248 × 1540 data matrix.
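The title refers to canopies, but the abstract does not spell out the algorithm. As a rough illustration only, the following is a minimal sketch of the generic canopy pre-clustering idea (a loose threshold for membership and a tight threshold for removing points from further consideration). It is not the authors' method. The function name `canopy_pass`, the thresholds `t1` and `t2`, and the use of Euclidean distance are all assumptions made for this example.

```python
import numpy as np


def canopy_pass(X, t1, t2, seed=0):
    """Generic canopy pass (illustrative sketch, not the paper's algorithm).

    X  : (n_samples, n_features) array, e.g. spectra as rows.
    t1 : loose threshold; points within t1 of a center join its canopy.
    t2 : tight threshold (t2 <= t1); points within t2 of a center are
         removed from the pool of candidate centers.
    Returns a list of (center_index, member_indices) tuples.
    Canopies may overlap because membership uses the looser threshold.
    """
    if t2 > t1:
        raise ValueError("t2 must not exceed t1")
    rng = np.random.default_rng(seed)
    remaining = np.arange(X.shape[0])  # indices still eligible as centers
    canopies = []
    while remaining.size > 0:
        center = int(rng.choice(remaining))
        # Euclidean distance from the chosen center to every point (assumed metric).
        dist = np.linalg.norm(X - X[center], axis=1)
        members = np.flatnonzero(dist <= t1)
        canopies.append((center, members))
        # Drop every candidate that lies within the tight threshold,
        # including the center itself (its distance is 0).
        remaining = remaining[dist[remaining] > t2]
    return canopies
```

In such a scheme, a more expensive clustering step would then typically be run only among points that share a canopy, which is what makes the approach attractive for a matrix as large as the one described in the abstract. The thresholds would need tuning for real spectra.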
Taxonomy
Topics: Blind Source Separation Techniques · Spectroscopy and Chemometric Analyses · Fractal and DNA sequence analysis
