Scaling Laws for Galaxy Images
Mike Walmsley, Micah Bowles, Anna M.M. Scaife, Jason Shingirai Makechemu, Alexander J. Gordon, Annette M.N. Ferguson, Robert G. Mann, James Pearson, Jürgen J. Popp, Jo Bovy, Josh Speagle, Hugh Dickinson, Lucy Fortson, Tobias Géron, Sandor Kruk, Chris J. Lintott

TL;DR
This paper investigates supervised scaling laws for galaxy images. Adding annotated galaxy images improves performance on every architecture and task tested, while adding trainable parameters helps only on some tasks.
Contribution
It provides the first systematic investigation of supervised scaling laws outside an ImageNet-like context, using galaxy images, and shows that additional annotated in-domain data and in-domain pretraining help more consistently than model scaling.
Findings
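Adding annotated galaxy images gives a power law improvement in performance across all architectures and all tasks.
Additional pretraining on galaxy images, compared with ImageNet-12k pretraining alone, gives an average relative error rate reduction of 31% across 5 downstream tasks.
Adding trainable parameters is effective only for some tasks, typically the more subjectively challenging ones.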
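To make the 31% figure concrete: a relative error rate reduction measures how much of the baseline model's error the new model removes, as a fraction of the baseline error. The sketch below is illustrative only. The function name and the two accuracies are hypothetical and are not taken from the paper, which reports an average over 5 tasks.

```python
def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Fraction of the baseline error rate removed by the new model."""
    baseline_err = 1.0 - baseline_acc
    new_err = 1.0 - new_acc
    if baseline_err <= 0:
        raise ValueError("baseline has zero error; reduction is undefined")
    return (baseline_err - new_err) / baseline_err


# Hypothetical accuracies for a single task (assumption, not paper data):
# baseline error 10.0%, new error 6.9%.
reduction = relative_error_reduction(baseline_acc=0.90, new_acc=0.931)
print(f"relative error rate reduction: {reduction:.0%}")
```

Note that a relative reduction of this size corresponds to a much smaller absolute accuracy change when the baseline is already accurate.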
Abstract
We present the first systematic investigation of supervised scaling laws outside of an ImageNet-like context - on images of galaxies. We use 840k galaxy images and over 100M annotations by Galaxy Zoo volunteers, comparable in scale to ImageNet-1K. We find that adding annotated galaxy images provides a power law improvement in performance across all architectures and all tasks, while adding trainable parameters is effective only for some (typically more subjectively challenging) tasks. We then compare the downstream performance of finetuned models pretrained on either ImageNet-12k alone vs. additionally pretrained on our galaxy images. We achieve an average relative error rate reduction of 31% across 5 downstream tasks of scientific interest. Our finetuned models are more label-efficient and, unlike their ImageNet-12k-pretrained equivalents, often achieve linear transfer performance…
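A "power law improvement" with dataset size means the loss falls roughly as a * N^(-b), which appears as a straight line on log-log axes. The sketch below shows one common way to estimate such a curve, by least squares in log-log space. It is not the authors' fitting procedure: the functional form (with no irreducible-loss term), the function name, and the synthetic data points are all assumptions made for illustration.

```python
import numpy as np


def fit_power_law(n, loss):
    """Fit loss ~ a * n**(-b) by linear least squares in log-log space.

    Returns (a, b). Assumes all inputs are strictly positive.
    """
    slope, intercept = np.polyfit(np.log(n), np.log(loss), deg=1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic, noise-free data (assumption, not results from the paper):
# training-set sizes up to 840k images and a made-up power law.
n_images = np.array([1e4, 3e4, 1e5, 3e5, 8.4e5])
true_a, true_b = 5.0, 0.25
loss = true_a * n_images ** (-true_b)

a, b = fit_power_law(n_images, loss)
print(f"fitted a={a:.3f}, b={b:.3f}")
```

With real measurements the points would be noisy, and a form with an additive constant may fit better; that variant would need a nonlinear fit rather than a log-log regression.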
Taxonomy
Topics: Computer Graphics and Visualization Techniques
