Robustness Analysis of USmorph: II. Optimizing Feature Extraction, Dimensionality Reduction, and Clustering for Unsupervised Galaxy Morphology Classification

Guanwen Fang; Xiaolei Yin; Yirui Zheng; Zesen Lin; Shiwei Zhu; Jie Song; Chichun Zhou; Xu Kong

arXiv:2605.20871·astro-ph.GA·May 21, 2026

TL;DR

This paper systematically improves the USmorph framework for galaxy morphology classification by optimizing feature extraction, dimensionality reduction, and clustering, demonstrating robustness and scientific validity for large-scale surveys.

Contribution

It introduces a Bagging-based multi-cluster voting scheme and evaluates multiple models and algorithms, establishing a more reliable and scalable unsupervised classification method.

Findings

01

An ImageNet-pretrained AlexNet is the most effective of the five pre-trained models evaluated for feature extraction from galaxy images.

02

UMAP gives the best balance between preserving high-dimensional structure and computational efficiency for dimensionality reduction.

03

Bagging voting scheme improves clustering stability and purity.

Abstract

We conduct a systematic robustness analysis of the unsupervised machine learning module within the hybrid framework USmorph. This module automatically discovers morphological structures from large-scale galaxy images, forming the foundation of the complete classification workflow. We evaluate five pre-trained models for feature extraction and identify an ImageNet-pretrained AlexNet as the most effective for capturing discriminative morphological features. UMAP is chosen for dimensionality reduction due to its optimal balance between preserving high-dimensional structure and computational efficiency. To enhance clustering stability, we propose a Bagging-based multi-cluster voting scheme, which significantly improves label consistency and cluster purity. We compare the convergence, scalability, and quality of five clustering algorithms, finding that the Bagging voting scheme has…
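
The abstract does not spell out the voting rule, so the following is only an illustrative sketch of how a multi-cluster voting step could work, not the authors' implementation. It assumes several clustering runs (for example, different algorithms or bootstrap resamples) have each labelled the same galaxies, that cluster ids are arbitrary across runs and must be aligned first, and that a galaxy keeps its label only when enough runs agree. The names `align_labels`, `vote`, and `min_agreement` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def align_labels(reference, labels, n_clusters):
    """Relabel `labels` with the id permutation that best matches `reference`.

    Brute force over all permutations, so only suitable for a small
    number of clusters.
    """
    best_perm, best_overlap = None, -1
    for perm in permutations(range(n_clusters)):
        overlap = sum(1 for r, l in zip(reference, labels) if perm[l] == r)
        if overlap > best_overlap:
            best_perm, best_overlap = perm, overlap
    return [best_perm[l] for l in labels]


def vote(label_sets, n_clusters, min_agreement=1.0):
    """Majority vote across runs; -1 marks samples without enough agreement."""
    reference = label_sets[0]
    aligned = [reference] + [
        align_labels(reference, labels, n_clusters) for labels in label_sets[1:]
    ]
    consensus = []
    for votes in zip(*aligned):
        label, count = Counter(votes).most_common(1)[0]
        consensus.append(label if count / len(votes) >= min_agreement else -1)
    return consensus


# Toy label sets from three hypothetical clustering runs over six galaxies.
runs = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 2, 2, 0, 0],  # same partition as the first run, permuted ids
    [2, 2, 0, 1, 1, 1],  # disagrees with the others on the fourth galaxy
]

strict = vote(runs, n_clusters=3)                      # unanimous agreement
relaxed = vote(runs, n_clusters=3, min_agreement=0.5)  # simple majority
print(strict)   # [0, 0, 1, -1, 2, 2]
print(relaxed)  # [0, 0, 1, 1, 2, 2]
```

Requiring unanimity trades sample size for purity: the fourth galaxy is rejected under the strict rule and kept under the majority rule. For more than a handful of clusters, the brute-force alignment would need to be replaced by an assignment solver such as the Hungarian algorithm.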
