Assessing Reproducibility in Evolutionary Computation: A Case Study using Human- and LLM-based Assessment

Francesca Da Ros; Tarik Začiragić; Aske Plaat; Thomas Bäck; Niki van Stein

arXiv:2602.07059 · cs.NE · February 10, 2026

Open Access

TL;DR

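This study evaluates reproducibility practices in papers published over a ten-year period in the Evolutionary Combinatorial Optimization and Metaheuristics track of the Genetic and Evolutionary Computation Conference. It introduces a structured checklist and RECAP, an LLM-based system that automates reproducibility assessment, and reports significant reproducibility gaps as well as the potential of automated assessment.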

Contribution

The paper presents a structured reproducibility checklist and RECAP, an LLM-based system for automated reproducibility evaluation in evolutionary computation research.

Findings

01 The assessed papers have an average reproducibility score of 0.62.

02 36.90% of papers provide additional materials.

03 RECAP achieves a Cohen's κ of 0.67 against human evaluators.
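
For readers unfamiliar with the agreement statistic in the third finding, the following is a minimal sketch of how Cohen's kappa is computed for two raters. It is illustrative only: the function name `cohens_kappa` and the `human` and `model` label lists are hypothetical and are not taken from the paper or from RECAP, and the paper's own labels and computation may differ.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: probability that both raters pick the same label
    # if each labelled independently according to their own label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical checklist answers for ten items (NOT data from the paper).
human = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
model = ["yes", "yes", "no", "no", "no", "no", "yes", "yes", "yes", "yes"]

kappa = cohens_kappa(human, model)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so two raters who agree on 80% of items can still score well below 0.80 when both favour the same label most of the time.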

Abstract

Reproducibility is an important requirement in evolutionary computation, where results largely depend on computational experiments. In practice, reproducibility relies on how algorithms, experimental protocols, and artifacts are documented and shared. Despite growing awareness, there is still limited empirical evidence on the actual reproducibility levels of published work in the field. In this paper, we study the reproducibility practices in papers published in the Evolutionary Combinatorial Optimization and Metaheuristics track of the Genetic and Evolutionary Computation Conference over a ten-year period. We introduce a structured reproducibility checklist and apply it through a systematic manual assessment of the selected corpus. In addition, we propose RECAP (REproducibility Checklist Automation Pipeline), an LLM-based system that automatically evaluates reproducibility signals from…

Taxonomy

Topics: Scientific Computing and Data Management · Machine Learning in Materials Science · Software Engineering Research