Do Papers Tell the Whole Story? A Benchmark and Framework for Uncovering Hidden Implementation Gaps in Bioinformatics
Tianxiang Xu, Xiaoyan Zhu, Xin Lai, Sizhe Dang, Xin Lian, Hangyu Cheng, Jiayin Wang

TL;DR
This paper introduces BioCon, a benchmark dataset, together with a framework for detecting inconsistencies between bioinformatics papers and their code, with the aim of improving reproducibility and reliability in scientific research.
Contribution
It presents the first dataset and unified framework for paper-code consistency detection in bioinformatics, enabling systematic analysis of semantic alignment.
Findings
Introduces BioCon, a high-quality dataset of sentence-code pairs.
Achieves strong performance in consistency discrimination and semantic alignment.
Establishes paper-code consistency detection as a new research direction for reproducibility in bioinformatics.
Abstract
Ensuring consistency between research papers and their corresponding software code implementations is a fundamental prerequisite for guaranteeing the reproducibility of scientific findings and the reliability of software systems. However, this issue has received limited attention to date, particularly in the field of bioinformatics, where inconsistencies between methodological descriptions in papers and their actual code implementations are prevalent. To address this gap, we introduce a novel research task, namely paper-code consistency detection, which aims to characterize the cross-modal semantic alignment between methodological descriptions in papers and their corresponding code implementations. At the data level, we construct the first benchmark dataset for this task in the bioinformatics domain, termed BioCon, comprising 48 bioinformatics software projects and their associated…
