Hyrax: An Extensible Framework for Rapid ML Experimentation and Unsupervised Discovery in the Era of Rubin, Roman, and Euclid
Aritra Ghosh, Drew Oldag, Michael Tauraso, Andrew J. Connolly, Peter Ferguson, Derek Jones, Gourav Khullar, Argyro Sasli, Samarth Venkatesh, Gracia Wang, Maxine West, Dylan Berry, Neven Caplar, Colin Orion Chandler, Tanawan Chatchadanoraset, Michael W. Coughlin, Melissa DeLucchi

TL;DR
Hyrax is an open-source, GPU-enabled Python framework designed for comprehensive ML experimentation in astronomy, supporting data handling, unsupervised discovery, and rapid iteration across large survey datasets.
Contribution
The paper introduces Hyrax, a modular framework that streamlines the astronomical ML pipeline from data acquisition and training through inference and experiment comparison, supporting discovery without labeled data and rapid experimentation.
Findings
Unsupervised learning identified new astronomical candidates in survey data.
Hybrid clustering effectively found gravitational lens candidates.
Multimodal classification improved transient event analysis.
Abstract
The NSF-DOE Vera C. Rubin Observatory, Roman Space Telescope, Euclid, and other next-generation surveys will deliver imaging, spectroscopic, and time-domain data at scales that increasingly shift the bottleneck in astronomical machine learning (ML) projects from model design to infrastructure. We present Hyrax, an open-source, modular, GPU-enabled Python framework that supports the full ML lifecycle in astronomy: from data acquisition and training to inference and experiment comparison, with capabilities including multimodal dataset support, integrated vector databases for similarity search, and interactive two- and three-dimensional latent-space exploration for unsupervised discovery. We demonstrate Hyrax's versatility through five representative applications on real survey data: (i) unsupervised representation learning on Rubin Legacy Survey of Space and Time (LSST)…
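The similarity search over learned representations that the abstract mentions can be illustrated with a minimal sketch. This is not Hyrax's API: the function name `top_k_similar`, the random stand-in embeddings, and the brute-force cosine comparison are all assumptions made for illustration, using plain NumPy in place of a vector database.

```python
import numpy as np


def top_k_similar(embeddings, query, k=3):
    """Return (indices, scores) of the k rows most cosine-similar to `query`.

    Hypothetical helper for illustration only; a real vector database
    would replace this brute-force scan with an index.
    """
    # Normalize each embedding and the query to unit length so that
    # the dot product equals cosine similarity.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = emb @ q
    # Sort by descending similarity and keep the first k entries.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Stand-in latent vectors (assumption: 1000 objects, 16-dimensional latent space).
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 16))

# Query with an object already in the set; it should be its own nearest neighbor.
idx, sims = top_k_similar(latent, latent[42], k=3)
```

In practice the rows of `latent` would come from a trained encoder, and the nearest neighbors of an interesting object become candidates for visual inspection.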
