What did you Mention? A Large Scale Mention Detection Benchmark for Spoken and Written Text

Yosi Mass; Lili Kotlerman; Shachar Mirkin; Elad Venezian; Gera Witzling; Noam Slonim

arXiv:1801.07507 · cs.CL · January 26, 2018 · 5 cites

TL;DR

This paper introduces a comprehensive benchmark for Mention Detection across spoken and written texts, enabling evaluation of tools on diverse, high-quality annotated data.

Contribution

It presents a large-scale, high-quality mention detection benchmark with annotations for various entity types across different text modalities, built through a controlled crowdsourcing process.

Findings

01 A state-of-the-art system is evaluated on the benchmark.

02 The benchmark covers both written and spoken text.

03 A controlled crowdsourcing process ensures high annotation quality.

Abstract

We describe a large, high-quality benchmark for the evaluation of Mention Detection tools. The benchmark contains annotations of both named entities and other types of entities, annotated on different types of text, ranging from clean text taken from Wikipedia to noisy spoken data. The benchmark was built through a highly controlled crowdsourcing process to ensure its quality. We describe the benchmark, the process, and the guidelines that were used to build it. We then present the results of a state-of-the-art system run on that benchmark.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems