TL;DR
This paper provides a comprehensive review of current research efforts, challenges, and debates surrounding reproducibility in NLP, highlighting the diversity of perspectives and the need for consensus.
Contribution
It offers a systematic overview of NLP reproducibility research, identifying key themes, differences, and commonalities in the field.
Findings
Definitions and measures of reproducibility vary widely across NLP work.
Growing initiatives and interest in addressing reproducibility issues.
No consensus yet on how reproducibility should be defined, measured, and addressed; views are currently diverging rather than converging.
Abstract
Against the background of what has been termed a reproducibility crisis in science, the NLP field is becoming increasingly interested in, and conscientious about, the reproducibility of its results. The past few years have seen an impressive range of new initiatives, events and active research in the area. However, the field is far from reaching a consensus about how reproducibility should be defined, measured and addressed, with diversity of views currently increasing rather than converging. With this focused contribution, we aim to provide a wide-angle, and as near as possible complete, snapshot of current work on reproducibility in NLP, delineating differences and similarities, and providing pointers to common denominators.
