DocOIE: A Document-level Context-Aware Dataset for OpenIE
Kuicai Dong, Yilin Zhao, Aixin Sun, Jung-Jae Kim, Xiaoli Li

TL;DR
This paper introduces DocOIE, a manually annotated dataset for document-level context-aware OpenIE, and DocIE, a model that uses document-level context during extraction. Its experiments indicate that incorporating document context improves OpenIE extraction.
Contribution
The paper presents the first document-level context-aware OpenIE dataset and a novel model, DocIE, showing improved performance over sentence-level methods.
Findings
Incorporating document context enhances OpenIE accuracy.
The DocOIE dataset contains 800 manually annotated sentences from 80 documents in two domains: Healthcare and Transportation.
DocIE outperforms sentence-level OpenIE models.
Abstract
Open Information Extraction (OpenIE) aims to extract structured relational tuples (subject, relation, object) from sentences and plays critical roles for many downstream NLP applications. Existing solutions perform extraction at sentence level, without referring to any additional contextual information. In reality, however, a sentence typically exists as part of a document rather than standalone; we often need to access relevant contextual information around the sentence before we can accurately interpret it. As there is no document-level context-aware OpenIE dataset available, we manually annotate 800 sentences from 80 documents in two domains (Healthcare and Transportation) to form a DocOIE dataset for evaluation. In addition, we propose DocIE, a novel document-level context-aware OpenIE model. Our experimental results based on DocIE demonstrate that incorporating document-level…
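To make the tuple format and the idea of sentence context concrete, here is a minimal sketch. It is not the DocIE model and does not use DocOIE data: the `RelationTuple` class, the `context_window` helper, the `radius` parameter, and the example sentences are all hypothetical and written only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RelationTuple:
    """One OpenIE extraction: (subject, relation, object)."""
    subject: str
    relation: str
    object: str


def context_window(sentences: List[str], index: int, radius: int = 1) -> List[str]:
    """Return the sentences within `radius` positions of the target sentence,
    excluding the target itself. A simple stand-in for document context."""
    start = max(0, index - radius)
    end = min(len(sentences), index + radius + 1)
    return [s for i, s in enumerate(sentences[start:end], start=start) if i != index]


# Hypothetical three-sentence document (not taken from DocOIE).
doc = [
    "Patients wear a wrist monitor.",
    "It records heart rate.",
    "The data is reviewed weekly.",
]

target = 1
context = context_window(doc, target)

# Hand-written extraction for the target sentence. Read in isolation, the
# subject would be the ambiguous pronoun "It"; the preceding sentence is
# what lets an annotator resolve it to "a wrist monitor".
extraction = RelationTuple("a wrist monitor", "records", "heart rate")
```

A sentence-level extractor sees only `doc[target]`, while a context-aware one would also receive something like `context`. How DocIE actually selects and encodes context is not described in the text above.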
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
