Clustering large number of extragalactic spectra of galaxies and quasars through canopies

Tuli De; Didier Fraix-Burnet (IPAG); Asis Kumar Chattopadhyay

arXiv:1309.3729·astro-ph.CO·September 17, 2013·2 cites

Open Access

TL;DR

This paper introduces a new clustering method tailored for large, high-dimensional datasets, successfully applied to over 700,000 galaxy and quasar spectra to identify five distinct groups.

Contribution

A novel clustering technique designed for large, high-dimensional data sets, demonstrated on extensive astronomical spectral data.

Findings

01 Successfully clustered 702,248 spectra into five groups

02 Effective handling of high-dimensional, large-scale data

03 Demonstrated applicability to astronomical datasets

Abstract

Cluster analysis is the distribution of objects into different groups or, more precisely, the partitioning of a data set into subsets (clusters) so that the data in each subset share some common trait according to some distance measure. Unlike classification, in clustering one has to first decide the optimum number of clusters and then assign the objects to the different clusters. Solving such problems for a large number of high-dimensional data points is quite complicated, and most existing algorithms do not perform properly. In the present work, a new clustering technique applicable to large data sets has been used to cluster the spectra of 702248 galaxies and quasars, each having 1540 points in the wavelength range imposed by the instrument. The proposed technique has successfully discovered five clusters from this 702248 × 1540 data matrix.
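The abstract does not spell out the algorithm, but the title refers to canopies. As an illustration only, the following is a minimal sketch of the generic canopy pre-clustering idea, not the authors' method. It assumes a Euclidean distance and two hypothetical thresholds, a loose one `t1` and a tight one `t2` with `t1 > t2 > 0`. The function name `canopy_clusters` and the toy data are invented for this example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy_clusters(points, t1, t2, seed=0):
    """Group points into (possibly overlapping) canopies.

    t1 is the loose threshold (canopy membership) and t2 the tight one
    (removal from the candidate pool). Both are assumptions of this sketch.
    Returns a list of (center_index, member_indices) pairs.
    """
    if not (t1 > t2 > 0):
        raise ValueError("thresholds must satisfy t1 > t2 > 0")
    rng = random.Random(seed)
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        center = rng.choice(remaining)
        members, still_remaining = [], []
        for i in remaining:
            d = euclidean(points[center], points[i])
            if d < t1:
                members.append(i)          # joins this canopy
            if d >= t2:
                still_remaining.append(i)  # may still seed or join another canopy
        canopies.append((center, members))
        remaining = still_remaining        # the center itself (d = 0) is always dropped
    return canopies


# Toy stand-in for spectra: six 2-point "spectra" forming two tight groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
canopies = canopy_clusters(points, t1=1.0, t2=0.5)
for center, members in canopies:
    print(center, members)
```

In the usual canopy scheme, a more expensive clustering step is then restricted to points that share a canopy, which is what makes the approach attractive for a matrix as large as 702248 × 1540. Whether the paper follows this scheme exactly is not stated in the abstract.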

Taxonomy

Topics: Blind Source Separation Techniques · Spectroscopy and Chemometric Analyses · Fractal and DNA sequence analysis