TL;DR
This paper introduces an automated error analysis framework for document-level information extraction (IE) that exposes the kinds of errors models make, which precision/recall/F1 scores alone do not show. It uses the framework to compare two state-of-the-art template-filling approaches on datasets from three domains and against four systems from the MUC-4 (1992) evaluation.
Contribution
Building on Kummerfeld and Klein (2013), the paper proposes a transformation-based framework that automates error analysis for document-level event and N-ary relation extraction, and applies it to compare state-of-the-art methods.
Findings
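The framework surfaces the range of errors that document-level IE models make, beyond what precision/recall/F1 scores report.
Two state-of-the-art template-filling approaches are compared with four systems from the MUC-4 (1992) evaluation to gauge progress in IE over 30 years.
The breakdown by error type indicates where future models need to improve.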
Abstract
Document-level information extraction (IE) tasks have recently begun to be revisited in earnest using the end-to-end neural network techniques that have been successful on their sentence-level IE counterparts. Evaluation of the approaches, however, has been limited in a number of dimensions. In particular, the precision/recall/F1 scores typically reported provide few insights on the range of errors the models make. We build on the work of Kummerfeld and Klein (2013) to propose a transformation-based framework for automating error analysis in document-level event and (N-ary) relation extraction. We employ our framework to compare two state-of-the-art document-level template-filling approaches on datasets from three domains; and then, to gauge progress in IE since its inception 30 years ago, vs. four systems from the MUC-4 (1992) evaluation.
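The abstract's point that precision/recall/F1 "provide few insights on the range of errors" can be illustrated with a toy sketch. This is not the authors' framework: the function names (`prf1`, `transformations`), the three error categories, and the example role fillers are all assumptions made for illustration. It scores two hypothetical system outputs against one gold template and counts the edits needed to turn each prediction into the gold answer.

```python
from collections import Counter

def prf1(gold, pred):
    """Micro precision/recall/F1 over (role, filler) pairs."""
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def transformations(gold, pred):
    """Count toy edits that would turn pred into gold (hypothetical categories)."""
    counts = Counter()
    missing = gold - pred   # gold pairs the system did not produce
    wrong = pred - gold     # predicted pairs that are not in gold
    missing_fillers = Counter(filler for _, filler in missing)
    for _, filler in wrong:
        if missing_fillers[filler] > 0:
            # Right filler, wrong role: one edit moves it to the correct role.
            counts["alter_role"] += 1
            missing_fillers[filler] -= 1
        else:
            # Filler does not belong in the template at all.
            counts["remove_filler"] += 1
    # Gold fillers still unaccounted for must be added.
    counts["introduce_filler"] += sum(missing_fillers.values())
    return counts

# Illustrative gold template and two hypothetical system outputs.
gold = {("PerpInd", "guerrillas"), ("Target", "embassy"), ("Victim", "ambassador")}
sys_a = {("PerpInd", "guerrillas"), ("Target", "embassy"), ("Target", "ambassador")}
sys_b = {("PerpInd", "guerrillas"), ("Target", "embassy"), ("Victim", "soldiers")}

for name, pred in [("A", sys_a), ("B", sys_b)]:
    print(name, prf1(gold, pred), dict(transformations(gold, pred)))
```

Both systems get identical precision, recall, and F1, yet system A needs a single role correction while system B needs one filler removed and another introduced. A count of transformations separates the two cases where the aggregate scores cannot.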
