AI Cosplaying as Astrophysicists: A Controlled Synthetic-Agent Study of AI-Assisted Astrophysical Research Workflows
Chun Huang

TL;DR
Using synthetic researchers, this study systematically evaluates how different styles of AI assistance affect astrophysical research workflows, and finds that AI's usefulness depends on the task and on the usage policy.
Contribution
It introduces a controlled simulation framework to assess AI assistance in astrophysics, highlighting the conditional and task-specific nature of AI benefits.
Findings
Cautious AI assistance improves creative and critique tasks but can fail on physics derivations.
Verification-heavy AI use becomes the most effective policy after an actor-swap rerun.
The utility of AI assistance varies with the task, the workflow, and the AI model used.
Abstract
Large Language Models (LLMs) are now widely used in astrophysics, but do they actually make our lives easier, or do they merely invent new physics with enough confidence to hide a minus sign? In a specialized field where checking fluent hallucinations is itself labor-intensive, AI assistance can demand as much work as the task it claims to simplify. To evaluate where AI genuinely improves scientific workflows, we bypassed human trials and instead forced AI agents to cosplay as astrophysicists. We simulated 144 synthetic researchers, varying in career stage, AI awareness, and willingness to verify outputs, across 2,592 daily astrophysics research assignments. Comparing solo work against four styles of AI assistance produced 12,960 scored episodes. No assisted policy universally outperformed unassisted work in the primary Qwen production run. Instead, performance depends strongly on the…
