STRADAViT: Towards a Foundational Model for Radio Astronomy through Self-Supervised Transfer
Andrea DeMarco, Ian Fenech Conti, Hayley Camilleri, Ardiana Bushi, Simone Riggi

TL;DR
STRADAViT introduces a self-supervised Vision Transformer framework tailored for radio astronomy imagery, enhancing morphology analysis across diverse surveys through transfer learning.
Contribution
It presents a staged continued-pretraining approach that combines mixed-survey data curation, radio astronomy-aware view generation, and ViT-MAE initialization to improve transfer performance on radio astronomy imagery.
Findings
Best two-stage models improve Macro-F1 across benchmarks.
ViT-MAE-based checkpoint offers competitive transfer with lower downstream costs.
Radio-aware view generation enhances domain adaptation.
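Macro-F1, the metric cited in the findings above, is the unweighted mean of per-class F1 scores, so rare morphology classes count as much as common ones. The following is a minimal sketch of that standard definition in plain Python, not the authors' evaluation code; the function name `macro_f1` and the example class labels are assumptions made for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # correct hits for class c
        fp = sum(1 for t, p in pairs if t != c and p == c)  # predicted c, was another class
        fn = sum(1 for t, p in pairs if t == c and p != c)  # was c, predicted another class
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only (not taken from the paper's benchmarks).
y_true = ["FRI", "FRI", "FRI", "FRII", "compact"]
y_pred = ["FRI", "FRI", "FRII", "FRII", "compact"]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally to the average, a model that ignores a small class is penalized more heavily than it would be under plain accuracy, which is why the metric suits imbalanced morphology benchmarks.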
Abstract
Next-generation radio astronomy surveys are delivering millions of resolved sources, but robust and scalable morphology analysis remains difficult across heterogeneous telescopes and imaging pipelines. We present STRADAViT, a self-supervised Vision Transformer continued-pretraining framework for learning transferable encoders from radio astronomy imagery. The framework combines mixed-survey data curation, radio astronomy-aware training-view generation, and a ViT-MAE-initialized encoder family with optional register tokens. It supports reconstruction-only, contrastive-only, and two-stage branches. Our pretraining dataset comprises radio astronomy cutouts drawn from four complementary sources. We evaluate transfer with linear probing and fine-tuning on three morphology benchmarks spanning binary and multi-class settings. Relative to the ViT-MAE initialization used for continued…
