Applying Vision Transformers on Spectral Analysis of Astronomical Objects
Luis Felipe Strano Moraes, Ignacio Becker, Pavlos Protopapas, and Guillermo Cabrera-Vives

TL;DR
This paper demonstrates that pre-trained Vision Transformers can effectively analyze astronomical spectral data by converting spectra into images, achieving high accuracy in classification and redshift estimation.
Contribution
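It introduces an approach that applies pre-trained Vision Transformers to spectral analysis by rendering spectra as images, reporting higher classification accuracy than Support Vector Machines and Random Forests and R^2 values for redshift estimation comparable to AstroCLIP's spectrum encoder.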
Findings
Achieved higher classification accuracy than SVMs and Random Forests.
Attained R^2 values comparable to specialized spectrum encoders.
Validated effectiveness on large-scale real spectroscopic data.
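For readers unfamiliar with the two metrics named in the findings, the following is a minimal sketch of how classification accuracy and R^2 are conventionally computed. It is not taken from the paper; the function names and the toy values are illustrative assumptions.

```python
import numpy as np

def accuracy(labels_true, labels_pred):
    """Fraction of objects whose predicted class matches the true class."""
    return float(np.mean(np.asarray(labels_true) == np.asarray(labels_pred)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Toy values for illustration only (not results from the paper).
z_true = np.array([0.1, 0.2, 0.3, 0.4])
perfect = r_squared(z_true, z_true)                       # exact predictions
baseline = r_squared(z_true, np.full(4, z_true.mean()))   # always predict the mean
acc = accuracy(["STAR", "GALAXY", "QSO"], ["STAR", "GALAXY", "STAR"])
```

An R^2 of 1 means the predicted redshifts match the true ones exactly, while a model that always predicts the mean redshift scores 0.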
Abstract
We apply pre-trained Vision Transformers (ViTs), originally developed for image recognition, to the analysis of astronomical spectral data. By converting traditional one-dimensional spectra into two-dimensional image representations, we enable ViTs to capture both local and global spectral features through spatial self-attention. We fine-tune a ViT pre-trained on ImageNet using millions of spectra from the SDSS and LAMOST surveys, represented as spectral plots. Our model is evaluated on key tasks including stellar object classification and redshift (z) estimation, where it demonstrates strong performance and scalability. We achieve classification accuracy higher than Support Vector Machines and Random Forests, and attain R^2 values comparable to AstroCLIP's spectrum encoder, even when generalizing across diverse object types. These results demonstrate the effectiveness of using…
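The spectrum-to-image step described in the abstract can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the abstract says only that spectra are "represented as spectral plots", so the 224x224 image size, the 16x16 patch size, the min-max normalization, and the helper names `spectrum_to_image` and `to_patches` are all assumptions chosen to match a common ViT configuration.

```python
import numpy as np

def spectrum_to_image(wavelength, flux, size=224):
    """Rasterize a 1D spectrum as a line plot in a (size, size) uint8 image.

    Assumes `wavelength` is sorted in increasing order (required by np.interp).
    """
    wavelength = np.asarray(wavelength, dtype=float)
    flux = np.asarray(flux, dtype=float)
    # Resample the flux to one value per pixel column.
    grid = np.linspace(wavelength.min(), wavelength.max(), size)
    resampled = np.interp(grid, wavelength, flux)
    # Min-max normalize to [0, 1]; a flat spectrum is drawn at mid-height.
    span = resampled.max() - resampled.min()
    if span > 0:
        norm = (resampled - resampled.min()) / span
    else:
        norm = np.full(size, 0.5)
    # Row 0 is the top of the image, so high flux maps to small row indices.
    rows = np.round((1.0 - norm) * (size - 1)).astype(int)
    image = np.zeros((size, size), dtype=np.uint8)
    for col in range(size):
        # Fill between neighbouring columns so the curve has no vertical gaps.
        prev = rows[col - 1] if col > 0 else rows[col]
        lo, hi = sorted((prev, rows[col]))
        image[lo:hi + 1, col] = 255
    return image

def to_patches(image, patch=16):
    """Split a square image into non-overlapping patches in row-major order."""
    n = image.shape[0] // patch
    return image.reshape(n, patch, n, patch).swapaxes(1, 2).reshape(n * n, patch, patch)

# Synthetic spectrum for illustration: flat continuum with one absorption line.
wavelength = np.linspace(3800.0, 9200.0, 3000)
flux = 1.0 - 0.8 * np.exp(-0.5 * ((wavelength - 6563.0) / 20.0) ** 2)
image = spectrum_to_image(wavelength, flux)
patches = to_patches(image)
```

Each patch would then be embedded as one token, which is how self-attention can relate a narrow line in one region of the plot to the continuum shape elsewhere. An ImageNet-pre-trained ViT expects three channels, so a grayscale image like this one would typically be repeated across channels before fine-tuning.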
