An updated efficient galaxy morphology classification model based on ConvNeXt encoding with UMAP dimensionality reduction
Guanwen Fang, Shiwei Zhu, Jun Xu, Shiying Lu, Chichun Zhou, Yao Dai, Zesen Lin, Xu Kong

TL;DR
This paper introduces an improved unsupervised galaxy classification model combining ConvNeXt CNN feature extraction with UMAP for dimensionality reduction, enabling efficient large-scale morphological analysis aligned with galaxy evolution theories.
Contribution
The paper presents a two-stage unsupervised machine learning (UML) framework that improves classification efficiency and accuracy for large galaxy datasets by combining transfer learning with topology-preserving dimensionality reduction.
Findings
Classified 51% of galaxies into five morphology types
Reduced cluster number from 50 to 20 for computational efficiency
Results align well with galaxy evolution theory
Abstract
We present an enhanced unsupervised machine learning (UML) module within our previous USmorph classification framework featuring two components: (1) hierarchical feature extraction via a pre-trained ConvNeXt convolutional neural network (CNN) with transfer learning, and (2) nonlinear manifold learning using Uniform Manifold Approximation and Projection (UMAP) for topology-aware dimensionality reduction. This dual-stage design enables efficient knowledge transfer from large-scale visual datasets while preserving morphological pattern geometry through UMAP's neighborhood preservation. We apply the upgraded UML on I-band images of 99,806 COSMOS galaxies at redshift (to ensure rest-frame optical morphology) with . The predefined cluster number is optimized to 20 (reduced from 50 in the original framework), achieving significant computational…
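The two stages described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: torchvision's ConvNeXt-Tiny with ImageNet weights stands in for the unspecified ConvNeXt variant, k-means stands in for the clustering step (the abstract excerpt does not name the algorithm), and the UMAP settings (`n_components=10`, `n_neighbors=15`) are placeholder guesses. The function names are hypothetical. Only the cluster number of 20 comes from the abstract.

```python
from collections import Counter

N_CLUSTERS = 20  # cluster number quoted in the abstract


def encode_images(batch):
    """Stage 1: map a float tensor of shape (N, 3, H, W) to ConvNeXt features.

    Assumption: ConvNeXt-Tiny with ImageNet weights is used as the encoder.
    Single-band images would need to be repeated across 3 channels and
    normalised beforehand; that preprocessing is omitted here.
    """
    import torch
    from torchvision.models import ConvNeXt_Tiny_Weights, convnext_tiny

    model = convnext_tiny(weights=ConvNeXt_Tiny_Weights.DEFAULT)
    model.classifier[2] = torch.nn.Identity()  # drop the ImageNet head
    model.eval()
    with torch.no_grad():
        return model(batch).numpy()  # shape (N, 768)


def reduce_features(features, n_components=10):
    """Stage 2: UMAP embedding. The hyperparameters are assumptions."""
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(n_components=n_components, n_neighbors=15, random_state=0)
    return reducer.fit_transform(features)


def cluster_embedding(embedding, n_clusters=N_CLUSTERS):
    """Group the embedding. k-means is a stand-in, not the paper's stated method."""
    from sklearn.cluster import KMeans

    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return model.fit_predict(embedding)


def cluster_fractions(labels):
    """Return {cluster_id: fraction of the sample} for a sequence of labels."""
    counts = Counter(int(label) for label in labels)
    total = sum(counts.values())
    return {cid: n / total for cid, n in sorted(counts.items())}
```

With torch, torchvision, umap-learn, and scikit-learn installed, the stages chain as `cluster_fractions(cluster_embedding(reduce_features(encode_images(batch))))`. The batch must hold more images than both `n_neighbors` and `n_clusters` for the last two stages to run.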
Taxonomy
Topics: Topological and Geometric Data Analysis · Galaxies: Formation, Evolution, Phenomena · Machine Learning and Algorithms
