TL;DR
This paper introduces a novel predictive modeling framework that assesses scientific reproducibility by analyzing artifacts and linguistic features, offering a structured, author-centric approach to streamline reproducibility evaluation.
Contribution
It presents a dual-spectrum, model-based framework for reproducibility assessment that covers author-centric and external-agent perspectives and combines linguistic analysis with artifact evaluation, rather than inferring reproducibility from artifact availability alone.
Findings
Linguistic features such as readability correlate with high reproducibility.
The framework effectively distinguishes reproducible papers from non-reproducible ones.
Artifact availability alone is insufficient for assessing reproducibility.
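To make the readability finding concrete, here is a minimal sketch of how a readability score could be computed for a passage of paper text. The summary does not say which readability measure the authors used, so the choice of the Flesch Reading Ease formula is an assumption, and the function names `count_syllables` and `flesch_reading_ease` are hypothetical. The syllable counter is a rough vowel-group heuristic, not a dictionary lookup.

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate: count runs of consecutive vowels (assumed heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text.

    Assumption: this metric stands in for whatever readability feature
    the paper actually uses.
    """
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are alphabetic runs, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    plain = "We share the code. We share the data. Each step is listed."
    dense = ("Reproducibility evaluation necessitates comprehensive "
             "methodological documentation.")
    print(round(flesch_reading_ease(plain), 2))
    print(round(flesch_reading_ease(dense), 2))
```

A score like this would be only one input feature among several; on its own it says nothing about whether a paper's results can be reproduced.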
Abstract
The reproducibility of scientific articles is central to the advancement of science. Despite this importance, evaluating reproducibility remains challenging due to the scarcity of ground truth data. Predictive models can address this limitation by streamlining the tedious evaluation process. Typically, a paper's reproducibility is inferred based on the availability of artifacts such as code, data, or supplemental information, often without extensive empirical investigation. To address these issues, we utilized artifacts of papers as fundamental units to develop a novel, dual-spectrum framework that focuses on author-centric and external-agent perspectives. We used the author-centric spectrum, followed by the external-agent spectrum, to guide a structured, model-based approach to quantify and assess reproducibility. We explored the interdependencies between different factors influencing…
