Probabilistic Coreference in Information Extraction

Andrew Kehler (SRI International)

arXiv:cmp-lg/9706012 · cmp-lg · February 3, 2008 · 52 cites

TL;DR

This paper explores methods for assigning a probability distribution to alternative sets of coreference relationships among entity descriptions, so that a downstream system can fuse information extraction output with possibly contradictory information from other sources.

Contribution

It reports initial experiments with several approaches to estimating probability distributions over alternative sets of coreference relationships, in an application using SRI's FASTUS information extraction system.

Findings

01

Initial experiments with several estimation approaches, run in an application using SRI's FASTUS system, indicate that probabilistic coreference assignment is feasible

02

Probabilistic output is intended to let a downstream system fuse extracted data with possibly contradictory information from other sources

03

The distributions range over alternative sets of coreference relationships among entity descriptions

Abstract

Certain applications require that the output of an information extraction system be probabilistic, so that a downstream system can reliably fuse the output with possibly contradictory information from other sources. In this paper we consider the problem of assigning a probability distribution to alternative sets of coreference relationships among entity descriptions. We present the results of initial experiments with several approaches to estimating such distributions in an application using SRI's FASTUS information extraction system.
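
To make "a probability distribution over alternative sets of coreference relationships" concrete, here is a minimal Python sketch. It is not the paper's method and is unrelated to FASTUS internals: it assumes hypothetical pairwise probabilities that two entity descriptions corefer, enumerates every way of grouping the descriptions, scores each grouping as a product of independent pairwise terms, and normalizes. The names `partitions`, `coreference_distribution`, `mentions`, and `pair_prob`, along with all numeric values, are illustrative assumptions.

```python
from itertools import combinations


def partitions(items):
    """Yield every way to split `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Option 1: `first` forms a group of its own.
        yield [[first]] + part
        # Option 2: `first` joins one of the existing groups.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def coreference_distribution(mentions, pair_prob):
    """Return (partition, probability) pairs that sum to 1.

    Assumption: each pair of mentions contributes an independent factor,
    p if the partition puts them in the same group, 1 - p otherwise.
    Values of p are assumed to lie strictly between 0 and 1.
    """
    scored = []
    for part in partitions(mentions):
        group_of = {m: gi for gi, group in enumerate(part) for m in group}
        score = 1.0
        for a, b in combinations(mentions, 2):
            p = pair_prob[frozenset((a, b))]
            score *= p if group_of[a] == group_of[b] else 1.0 - p
        scored.append((part, score))
    total = sum(score for _, score in scored)
    return [(part, score / total) for part, score in scored]


# Hypothetical entity descriptions and pairwise coreference probabilities.
mentions = ["m1", "m2", "m3"]
pair_prob = {
    frozenset(("m1", "m2")): 0.8,
    frozenset(("m1", "m3")): 0.3,
    frozenset(("m2", "m3")): 0.4,
}

dist = coreference_distribution(mentions, pair_prob)
for part, prob in sorted(dist, key=lambda x: -x[1]):
    print(f"{prob:.3f}  {part}")
```

Each entry of `dist` pairs one alternative coreference set with its probability, which is the kind of output a downstream fusion component could weigh against other sources. Exhaustive enumeration grows very quickly with the number of descriptions, so this sketch only suits small examples.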

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Advanced Database Systems and Queries