An Empirical Study on Finding Spans
Weiwei Gu, Boyuan Zheng, Yunmo Chen, Tongfei Chen, Benjamin Van Durme

TL;DR
This paper empirically compares different span finding methods for information extraction, revealing that no single approach is best universally and offering insights on how task properties influence method performance.
Contribution
It provides a comprehensive empirical analysis of span finding techniques, highlighting the impact of task-specific factors on their effectiveness.
Findings
Tagging approaches yield higher precision
Span enumeration and boundary prediction offer higher recall
Span type information benefits boundary prediction
Abstract
We present an empirical study on methods for span finding, the selection of consecutive tokens in text for some downstream tasks. We focus on approaches that can be employed in training end-to-end information extraction systems. We find that there is no definitive solution without considering task properties, and we offer observations to guide future design choices: 1) a tagging approach often yields higher precision while span enumeration and boundary prediction provide higher recall; 2) span type information can benefit a boundary prediction approach; 3) additional contextualization does not help span finding in most cases.
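The three families of span finders named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the bare B/I/O label set, the inclusive (start, end) span convention, and the `max_width` cap are all assumptions made for illustration, and the scoring model that would produce tags or boundary indices is omitted entirely.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end); an assumed convention


def spans_from_bio(tags: List[str]) -> List[Span]:
    """Tagging: decode one B/I/O label per token into non-overlapping spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:          # close the span that was open
                spans.append((start, i - 1))
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i - 1))
            start = None
        elif tag == "I" and start is None:
            start = i                      # tolerate an I with no preceding B
    if start is not None:                  # span running to the last token
        spans.append((start, len(tags) - 1))
    return spans


def enumerate_spans(n_tokens: int, max_width: int) -> List[Span]:
    """Span enumeration: list every candidate span up to max_width tokens.
    A real system would score each candidate and keep the best ones."""
    return [
        (i, j)
        for i in range(n_tokens)
        for j in range(i, min(i + max_width, n_tokens))
    ]


def spans_from_boundaries(
    starts: List[int], ends: List[int], max_width: int
) -> List[Span]:
    """Boundary prediction: pair predicted start tokens with predicted end
    tokens, keeping pairs that form a valid span of at most max_width tokens."""
    return [(s, e) for s in starts for e in ends if 0 <= e - s < max_width]
```

For example, `spans_from_bio(["O", "B", "I", "O", "B"])` decodes to the spans (1, 2) and (4, 4). Note that the tagging decoder emits at most one span per token, while the other two functions can propose overlapping candidates.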
Taxonomy
Topics: Web Data Mining and Analysis · Advanced Text Analysis Techniques · Natural Language Processing Techniques
