What did you Mention? A Large Scale Mention Detection Benchmark for Spoken and Written Text
Yosi Mass, Lili Kotlerman, Shachar Mirkin, Elad Venezian, Gera Witzling, Noam Slonim

TL;DR
This paper introduces a comprehensive benchmark for Mention Detection across spoken and written texts, enabling evaluation of tools on diverse, high-quality annotated data.
Contribution
It presents a large-scale, high-quality mention detection benchmark with annotations for various entity types across different text modalities, built through a controlled crowdsourcing process.
Findings
State-of-the-art system evaluated on the benchmark
Benchmark covers both written and spoken text types
Ensures high annotation quality through controlled crowdsourcing
Abstract
We describe a large, high-quality benchmark for the evaluation of Mention Detection tools. The benchmark contains annotations of both named entities and other types of entities, annotated on different types of text, ranging from clean text taken from Wikipedia to noisy spoken data. The benchmark was built through a highly controlled crowdsourcing process to ensure its quality. We describe the benchmark, the process and the guidelines that were used to build it. We then demonstrate the results of a state-of-the-art system running on that benchmark.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
