Iterative Document-level Information Extraction via Imitation Learning
Yunmo Chen, William Gantt, Weiwei Gu, Tongfei Chen, Aaron Steven White, Benjamin Van Durme

TL;DR
This paper introduces IterX, an iterative model trained with imitation learning for document-level template extraction. It extracts complex templates without requiring a predefined template order and achieves state-of-the-art results on SciREX and MUC-4.
Contribution
The paper proposes a novel imitation learning approach for iterative document-level extraction, removing the need for predefined template sequences and improving extraction performance.
Findings
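Achieves state-of-the-art results on two established benchmarks: 4-ary relation extraction on SciREX and template extraction on MUC-4.
Extracts complex templates (N-tuples mapping named slots to spans of text) without relying on a predefined template order.
Provides a strong baseline on the new BETTER Granular task.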
Abstract
We present a novel iterative extraction model, IterX, for extracting complex relations, or templates (i.e., N-tuples representing a mapping from named slots to spans of text) within a document. Documents may feature zero or more instances of a template of any given type, and the task of template extraction entails identifying the templates in a document and extracting each template's slot values. Our imitation learning approach casts the problem as a Markov decision process (MDP), and relieves the need to use predefined template orders to train an extractor. It leads to state-of-the-art results on two established benchmarks -- 4-ary relation extraction on SciREX and template extraction on MUC-4 -- as well as a strong baseline on the new BETTER Granular task.
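To make the abstract's terms concrete, the following is a minimal Python sketch, not the authors' implementation. It represents a template as a mapping from named slots to text spans and extracts templates one at a time until a policy decides to stop, which loosely mirrors the MDP framing (state: the document plus the templates extracted so far; action: emit another template or stop). All names here (Span, Template, extract_iteratively, toy_policy, the "attack" type and the "trigger" slot) are hypothetical and chosen for illustration only. The toy policy is a keyword matcher standing in for a learned model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Span:
    start: int  # inclusive token offset
    end: int    # exclusive token offset
    text: str


@dataclass
class Template:
    template_type: str
    # Named slot -> spans of text filling that slot.
    slots: Dict[str, List[Span]] = field(default_factory=dict)


# Hypothetical policy signature: given the document tokens and the templates
# extracted so far, return the next template, or None to stop.
Policy = Callable[[List[str], List[Template]], Optional[Template]]


def extract_iteratively(tokens: List[str], policy: Policy,
                        max_steps: int = 10) -> List[Template]:
    """Emit templates one at a time; no template order is fixed in advance."""
    extracted: List[Template] = []
    for _ in range(max_steps):
        next_template = policy(tokens, extracted)
        if next_template is None:  # the policy chose to stop
            break
        extracted.append(next_template)
    return extracted


def toy_policy(tokens: List[str],
               extracted: List[Template]) -> Optional[Template]:
    """Assumption: a keyword stand-in for a learned policy.

    Emits one 'attack' template per occurrence of the token 'bombing'
    that no earlier template has already used.
    """
    used = {span.start
            for template in extracted
            for spans in template.slots.values()
            for span in spans}
    for i, token in enumerate(tokens):
        if token == "bombing" and i not in used:
            return Template("attack", {"trigger": [Span(i, i + 1, token)]})
    return None


tokens = "a bombing in the city followed a bombing at the port".split()
templates = extract_iteratively(tokens, toy_policy)
```

In this sketch a document with no matching content yields an empty list, which corresponds to the "zero or more instances" case the abstract mentions. In a learned system the policy would be trained rather than hand-written; how IterX does so is described in the paper, not here.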
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques
