A Re-analysis of Repeatability and Reproducibility in the Ames-USDOE-FBI Study
Alan H. Dorfman, Richard Valliant

TL;DR
This paper critically re-analyzes a key study on forensic firearms examination, revealing that its conclusions about repeatability and reproducibility are misleading due to misinterpretation of statistical measures.
Contribution
It clarifies the proper interpretation of agreement metrics in forensic studies, challenging previous claims of satisfactory reliability.
Findings
Observed agreement does not necessarily indicate high reliability.
Proper use of expected agreement shows lower repeatability and reproducibility.
The study questions the validity of prior conclusions on forensic firearm exam reliability.
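To make the distinction between observed and expected agreement concrete, here is a minimal sketch in Python. It is not the authors' analysis: the counts are hypothetical, the function name `agreement_stats` is invented for illustration, and Cohen's kappa is used only as one common chance-corrected summary (this summary does not say which statistic the paper relies on).

```python
from collections import Counter


def agreement_stats(pairs):
    """Return (observed, expected, kappa) for a list of (first, second) ratings.

    observed: fraction of pairs where both ratings match.
    expected: agreement expected by chance if the two ratings were
              independent, given each rating's marginal frequencies.
    kappa:    Cohen's kappa, (observed - expected) / (1 - expected).
    """
    n = len(pairs)
    observed = sum(a == b for a, b in pairs) / n

    # Marginal counts of each category in the first and second rating.
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    expected = sum(first[c] * second[c] for c in first) / (n * n)

    # kappa is undefined when expected agreement is exactly 1.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")
    return observed, expected, kappa


# Hypothetical data (not from the Ames-USDOE-FBI study): 100 paired
# examinations where one category dominates both ratings.
pairs = (
    [("ID", "ID")] * 80
    + [("ID", "Inconclusive")] * 10
    + [("Inconclusive", "ID")] * 10
)

observed, expected, kappa = agreement_stats(pairs)
print(round(observed, 3), round(expected, 3), round(kappa, 3))
```

With these made-up counts, observed agreement is 0.80, yet chance alone would produce 0.82 because one category dominates, so the chance-corrected value is negative. This illustrates how a high raw agreement rate can coexist with poor reliability.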
Abstract
Forensic firearms identification, the determination by a trained firearms examiner as to whether or not bullets or cartridges came from a common weapon, has long been a mainstay in the criminal courts. Reliability of forensic firearms identification has been challenged in the general scientific community, and, in response, several studies have been carried out aimed at showing that firearms examination is accurate, that is, has low error rates. Less studied has been the question of consistency, of whether two examinations of the same bullets or cartridge cases come to the same conclusion, carried out by an examiner on separate occasions -- intrarater reliability or repeatability -- or by two examiners -- interrater reliability or reproducibility. One important study, described in a 2020 Report by the Ames Laboratory-USDOE to the Federal Bureau of Investigation, went beyond…
Taxonomy
Topics: Forensic and Genetic Research
