Reproducibility Beyond the Research Community: Experience from NLP Beginners
Shane Storks, Keunwoo Peter Yu, Joyce Chai

TL;DR
This study examines the challenges NLP beginners face when reproducing recent research results, and finds that authors' accessibility efforts, such as thorough documentation and easy access to models and datasets, matter more than the beginners' technical skill.
Contribution
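It provides empirical evidence, from a study of 93 students in an introductory NLP course, that authors' accessibility efforts are crucial for reproducibility among NLP beginners, while the beginners' technical skill has limited impact on the effort required.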
Findings
Accessibility efforts are key to successful reproduction.
Technical skill has limited impact on effort required.
Thorough documentation improves reproducibility experience.
Abstract
As NLP research attracts public attention and excitement, it becomes increasingly important for it to be accessible to a broad audience. As the research community works to democratize NLP, it remains unclear whether beginners to the field can easily apply the latest developments. To understand their needs, we conducted a study with 93 students in an introductory NLP course, where students reproduced results of recent NLP papers. Surprisingly, our results suggest that their technical skill (i.e., programming experience) has limited impact on their effort spent completing the exercise. Instead, we find accessibility efforts by research authors to be key to a successful experience, including thorough documentation and easy access to required models and datasets.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
