A Unified HI Rotation Curve Corpus for Computational Astrophysics: 438 Galaxies from SPARC, THINGS, LITTLE THINGS, and WALLABY DR2
David C. Flynn

TL;DR
This paper introduces a unified, structured dataset of 8,963 HI rotation curve measurements across 423 galaxies from four major surveys, designed for analysis and retrieval with both traditional numerical methods and AI models.
Contribution
It provides a unified, verified, and annotated corpus of galaxy rotation data in a machine-readable format, facilitating diverse astrophysical analyses and AI applications.
Findings
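The dataset includes 8,963 rotation curve measurements across 423 galaxies; the catalog has 438 entries in total, 15 of which are metadata-only THINGS galaxies.
A two-tier quality system distinguishes hand-curated rotation curves with per-point uncertainties (Tier 1) from automated pipeline products (Tier 2).
Three example analyses in Python demonstrate the dataset's utility.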
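As a rough illustration of catalog-level filtering by quality tier, the sketch below parses a tiny made-up excerpt with Python's standard csv module. The column names (galaxy, survey, tier, n_points), the galaxy names, and the rows are assumptions for illustration only; the released CSV may use a different schema.

```python
import csv
import io

# Hypothetical catalog excerpt: column names and rows are illustrative
# assumptions, not taken from the released CSV.
SAMPLE_CSV = """galaxy,survey,tier,n_points
GalaxyA,SPARC,1,25
GalaxyB,WALLABY DR2,2,8
GalaxyC,THINGS,1,0
"""


def select_tier(rows, tier):
    """Return rows in the given quality tier that have at least one ring."""
    return [
        row for row in rows
        if int(row["tier"]) == tier and int(row["n_points"]) > 0
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
tier1 = select_tier(rows, 1)
print([row["galaxy"] for row in tier1])
```

The extra n_points check is one possible way to skip metadata-only entries; how the actual catalog flags them is not stated here.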
Abstract
We present a unified corpus of 8,963 spatially resolved HI rotation curve measurements across 423 galaxies (438 total catalog entries including 15 metadata-only THINGS galaxies), drawn from four major surveys: SPARC (175), THINGS (34), LITTLE THINGS (26), and WALLABY DR2 (203). The corpus is distributed as a single structured JSON file with nested per-ring kinematic data, survey metadata, column definitions, and data-quality annotations, accompanied by a 438-row flat CSV for catalog-level filtering. All radii are in kiloparsecs, all velocities in km/s. Kinematic parameters have been verified against scanned primary tables. A two-tier quality system distinguishes hand-curated rotation curves with per-point uncertainties (Tier 1) from automated pipeline products (Tier 2). The corpus was designed for both traditional numerical analysis and Large Language Model retrieval-augmented…
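A minimal sketch of how nested per-ring data of this kind might be traversed, assuming a hypothetical layout in which each galaxy carries a list of rings with radius in kpc and velocity in km/s. The key names (galaxies, rings, r_kpc, v_kms, e_v_kms) and the sample values are invented for illustration and are not the corpus schema. The final lines check that the survey counts quoted in the abstract are mutually consistent.

```python
import json

# Hypothetical structure: key names and values are illustrative assumptions,
# not the schema of the released JSON file.
SAMPLE_JSON = """
{
  "galaxies": [
    {"name": "GalaxyA", "survey": "SPARC", "tier": 1,
     "rings": [
       {"r_kpc": 1.0, "v_kms": 50.0, "e_v_kms": 3.0},
       {"r_kpc": 2.0, "v_kms": 80.0, "e_v_kms": 4.0}
     ]},
    {"name": "GalaxyC", "survey": "THINGS", "tier": 1, "rings": []}
  ]
}
"""


def outermost_velocity(galaxy):
    """Return the velocity at the largest radius, or None if no rings exist."""
    rings = galaxy["rings"]
    if not rings:
        return None
    return max(rings, key=lambda ring: ring["r_kpc"])["v_kms"]


corpus = json.loads(SAMPLE_JSON)
summary = {g["name"]: outermost_velocity(g) for g in corpus["galaxies"]}
print(summary)

# Consistency check on the counts quoted in the abstract.
survey_counts = {"SPARC": 175, "THINGS": 34, "LITTLE THINGS": 26, "WALLABY DR2": 203}
total_entries = sum(survey_counts.values())  # 438 catalog entries
with_curves = total_entries - 15             # minus metadata-only THINGS entries
print(total_entries, with_curves)
```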
