An Empirical Study on Finding Spans

Weiwei Gu; Boyuan Zheng; Yunmo Chen; Tongfei Chen; Benjamin Van Durme

arXiv:2210.06824·cs.CL·October 17, 2022

Open Access

TL;DR

This paper empirically compares different span finding methods for information extraction, revealing that no single approach is best universally and offering insights on how task properties influence method performance.

Contribution

It provides a comprehensive empirical analysis of span finding techniques, highlighting the impact of task-specific factors on their effectiveness.

Findings

01 Tagging approaches yield higher precision

02 Span enumeration and boundary prediction offer higher recall

03 Span type information benefits boundary prediction
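To make the precision and recall contrast in the findings concrete, here is a minimal sketch of how the two quantities could be computed over predicted spans. It assumes exact-match scoring on (start, end) token offsets; this page does not state the paper's exact evaluation protocol, and the function name and example spans are hypothetical.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def span_precision_recall(predicted: Iterable[Span],
                          gold: Iterable[Span]) -> Tuple[float, float]:
    """Exact-match precision and recall over sets of spans (an assumed metric)."""
    pred_set, gold_set = set(predicted), set(gold)
    correct = len(pred_set & gold_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical outputs: a cautious finder proposes few spans,
# a liberal finder proposes many.
gold = [(1, 3), (4, 5)]
cautious = [(1, 3)]
liberal = [(1, 3), (4, 5), (0, 1), (2, 4)]

print(span_precision_recall(cautious, gold))  # (1.0, 0.5)
print(span_precision_recall(liberal, gold))   # (0.5, 1.0)
```

A finder that proposes few spans tends toward the first profile, and one that proposes many candidates tends toward the second, which is the kind of trade-off the findings describe.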

Abstract

We present an empirical study on methods for span finding, the selection of consecutive tokens in text for some downstream tasks. We focus on approaches that can be employed in training end-to-end information extraction systems. We find there is no definitive solution without considering task properties, and we offer observations to help with future design choices: 1) a tagging approach often yields higher precision while span enumeration and boundary prediction provide higher recall; 2) span type information can benefit a boundary prediction approach; 3) additional contextualization does not help span finding in most cases.
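The three families named in the abstract can be illustrated with a small sketch of how each one turns model outputs into candidate spans. This is a generic illustration, not the authors' implementation: the BIO tag scheme, the score threshold, the `max_width` cap, and all function names are assumptions introduced here, and the neural scoring models themselves are omitted.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def spans_from_bio(tags: Sequence[str]) -> List[Span]:
    """Tagging: decode spans from per-token B/I/O labels (assumed scheme)."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        # Any tag other than "I" closes the span that is currently open.
        if tag != "I" and start is not None:
            spans.append((start, i))
            start = None
        if tag == "B":
            start = i
        # An "I" with no open span is ignored in this sketch.
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def enumerate_spans(n_tokens: int, max_width: int) -> List[Span]:
    """Span enumeration: list every span up to max_width tokens.

    A real system would score each candidate and keep the high-scoring ones.
    """
    return [(i, j)
            for i in range(n_tokens)
            for j in range(i + 1, min(i + max_width, n_tokens) + 1)]


def spans_from_boundaries(start_scores: Sequence[float],
                          end_scores: Sequence[float],
                          threshold: float = 0.5,
                          max_width: int = 3) -> List[Span]:
    """Boundary prediction: pair likely start tokens with likely end tokens.

    end_scores[j] is the score that token j is the LAST token of a span.
    """
    starts = [i for i, s in enumerate(start_scores) if s >= threshold]
    ends = [j for j, s in enumerate(end_scores) if s >= threshold]
    return [(i, j + 1) for i in starts for j in ends if i <= j < i + max_width]


print(spans_from_bio(["O", "B", "I", "O", "B"]))   # [(1, 3), (4, 5)]
print(len(enumerate_spans(3, max_width=2)))         # 5
print(spans_from_boundaries([0.1, 0.9, 0.2, 0.1, 0.8],
                            [0.1, 0.2, 0.9, 0.1, 0.7]))  # [(1, 3), (4, 5)]
```

Note that tagging commits to one non-overlapping segmentation, whereas enumeration and boundary pairing can propose many overlapping candidates. That structural difference is one plausible reading of the precision and recall trade-off the abstract reports.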

Taxonomy

Topics: Web Data Mining and Analysis · Advanced Text Analysis Techniques · Natural Language Processing Techniques