When to Use What: An In-Depth Comparative Empirical Analysis of OpenIE Systems for Downstream Applications

Kevin Pei (Grainger College of Engineering, University of Illinois at Urbana-Champaign); Ishan Jindal (IBM Research); Kevin Chen-Chuan Chang (Grainger College of Engineering, University of Illinois at Urbana-Champaign); Chengxiang Zhai (Grainger College of Engineering, University of Illinois at Urbana-Champaign); Yunyao Li (Apple Knowledge Platform)

arXiv:2211.08228·cs.CL·November 16, 2022

Open Access

TL;DR

This paper provides an empirical comparison of neural OpenIE models, training sets, and benchmarks to guide users in selecting suitable systems for various downstream NLP tasks, and shows that the assumptions built into each model and dataset significantly affect performance.

Contribution

It offers an application-focused empirical survey that analyzes how different models and datasets impact OpenIE performance, aiding in informed system selection.

Findings

01 Model assumptions significantly affect performance

02 Training set differences influence model effectiveness

03 Recommendations improve downstream Complex QA results

Abstract

Open Information Extraction (OpenIE) has been used in the pipelines of various NLP tasks. Unfortunately, there is no clear consensus on which models to use in which tasks. Muddying things further is the lack of comparisons that take differing training sets into account. In this paper, we present an application-focused empirical survey of neural OpenIE models, training sets, and benchmarks in an effort to help users choose the most suitable OpenIE systems for their applications. We find that the different assumptions made by different models and datasets have a statistically significant effect on performance, making it important to choose the most appropriate model for one's applications. We demonstrate the applicability of our recommendations on a downstream Complex QA application.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research