ReproHum #0087-01: Human Evaluation Reproduction Report for Generating Fact Checking Explanations
Tyler Loakman, Chenghua Lin

TL;DR
This study partially reproduces the human evaluation of fact-checking explanations and reanalyzes the original raw results. Both lend support to the original findings and contribute to an assessment of reproducibility in NLP research.
Contribution
It provides a partial reproduction of the human evaluation in a study on generating fact-checking explanations, assessing how reproducible that evaluation is and whether the original conclusions hold.
Findings
Reproduction supports original findings on model efficacy.
Similar patterns observed between original and reproduced results.
Slight variations noted but main conclusions remain valid.
Abstract
This paper presents a partial reproduction of "Generating Fact Checking Explanations" by Atanasova et al. (2020) as part of the ReproHum element of the ReproNLP shared task, which reproduces the findings of NLP research regarding human evaluation. This shared task aims to investigate the extent to which NLP as a field is becoming more or less reproducible over time. Following the instructions provided by the task organisers and the original authors, we collect relative rankings of 3 fact-checking explanations (comprising a gold standard and the outputs of 2 models) for 40 inputs on the criterion of Coverage. The results of our reproduction and reanalysis of the original work's raw results lend support to the original findings, with similar patterns seen between the original work and our reproduction. Whilst we observe slight variation from the original results, our findings support the main…
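The ranking setup described in the abstract can be illustrated with a minimal sketch. It assumes each input yields one rank per explanation (1 = best) and that systems are compared by their mean rank; this aggregation, the system labels, and the toy values are all assumptions made for illustration and are not taken from the paper or its data.

```python
# Hypothetical toy rankings for the Coverage criterion (1 = best).
# These values are invented for illustration and are NOT results from the paper.
# "gold", "model_a" and "model_b" are placeholder labels for the gold-standard
# explanation and the two model outputs.
rankings = [
    {"gold": 1, "model_a": 2, "model_b": 3},
    {"gold": 1, "model_a": 3, "model_b": 2},
    {"gold": 2, "model_a": 1, "model_b": 3},
    {"gold": 1, "model_a": 2, "model_b": 3},
]


def mean_rank(rankings):
    """Average the rank each system received across all ranked inputs."""
    systems = rankings[0].keys()
    return {s: sum(r[s] for r in rankings) / len(rankings) for s in systems}


scores = mean_rank(rankings)

# Print systems from best (lowest mean rank) to worst.
for system, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{system}: {score:.2f}")
```

A lower mean rank indicates that annotators preferred that explanation more often. Comparing the ordering of systems between an original study and its reproduction is one simple way to check whether the same pattern holds.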
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Data Quality and Management · Topic Modeling
