Re-opening open-source science through AI assisted development
Ling-Hong Hung, Ka Yee Yeung

TL;DR
This paper demonstrates how AI can assist a single human in modifying large open-source scientific software, exemplified by STAR-Flex, to enhance accessibility and re-open science for community review.
Contribution
It introduces a novel AI-assisted development approach that enables rapid modification of complex open-source scientific codebases, exemplified by the creation of STAR-Flex.
Findings
AI enabled the addition of 16,000 lines of C++ code to STAR while maintaining its full original function
STAR-Flex is the first open-source software for processing 10x Flex data
The approach facilitates community review and vetting of AI-generated code
Abstract
Open-source scientific software is effectively closed to modification because of its complexity. With recent advances in technology, an agentic AI team led by a single human can now rapidly and robustly modify large codebases and re-open science to the community, which can review and vet the AI-generated code. We demonstrate this with a case study, STAR-Flex, an open-source fork of STAR that adds 16,000 lines of C++ code to process 10x Flex data while maintaining full original function. This is the first open-source processing software for Flex data and was written as part of the NIH-funded MorPHiC consortium.
Taxonomy
Topics: Scientific Computing and Data Management · Machine Learning in Materials Science · Computational Physics and Python Applications
