Generating Information Extraction Patterns from Overlapping and Variable Length Annotations using Sequence Alignment
Frank Meng, Craig A. Morioka, Danne C. Elbers

TL;DR
This paper uses sequence alignment to generate information extraction patterns from overlapping and variable-length annotations. The alignments also determine the context window around each target, so no fixed window of surrounding words has to be predefined. The approach is evaluated on the CoNLL-2003 NER task.
Contribution
The method applies sequence alignment to pattern generation for information extraction, producing patterns whose elements span multiple conceptual levels and accommodating annotations that overlap and vary in length.
Findings
Patterns are generated from sequences with overlapping and variable-length annotations
The context window for each target is determined by the alignment rather than fixed in advance
Evaluated on the CoNLL-2003 NER task
Abstract
Sequence alignments are used to capture patterns composed of elements representing multiple conceptual levels through the alignment of sequences that contain overlapping and variable length annotations. The alignments also determine the proper context window of words and phrases that most directly impact the meaning of a given target within a sentence, eliminating the need to predefine a fixed context window of words surrounding the targets. We evaluated the system using the CoNLL-2003 named entity recognition (NER) task.
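The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Smith-Waterman style local alignment, a token-level representation in which each token carries a set of labels (word form, part of speech, entity class), and a scoring scheme that counts shared labels. The names `score`, `local_align`, and `WILDCARD`, the label prefixes, and the gap and mismatch penalties are all hypothetical choices made for illustration.

```python
from typing import FrozenSet, List, Tuple

Element = FrozenSet[str]           # all labels that overlap on one token
WILDCARD: Element = frozenset({"*"})


def score(a: Element, b: Element, mismatch: int = -1) -> int:
    """Reward each shared label; penalize elements with nothing in common."""
    shared = len(a & b)
    return shared if shared else mismatch


def local_align(s: List[Element], t: List[Element],
                gap: int = -1) -> Tuple[int, List[Element]]:
    """Smith-Waterman local alignment over label sets.

    Returns the best score and a pattern: the labels shared by each
    aligned pair, with WILDCARD for gaps and label-free matches.
    """
    n, m = len(s), len(t)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + score(s[i - 1], t[j - 1]),
                H[i - 1][j] + gap,
                H[i][j - 1] + gap,
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the local alignment ends (score 0).
    pattern: List[Element] = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + score(s[i - 1], t[j - 1]):
            pattern.append((s[i - 1] & t[j - 1]) or WILDCARD)
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            pattern.append(WILDCARD)
            i -= 1
        else:
            pattern.append(WILDCARD)
            j -= 1
    pattern.reverse()
    return best, pattern


# Two toy sentences; the labels are made up for this example.
s1 = [frozenset({"w:flew", "pos:VBD"}),
      frozenset({"w:to", "pos:TO"}),
      frozenset({"w:Paris", "pos:NNP", "ent:LOC"}),
      frozenset({"w:yesterday", "pos:NN"})]
s2 = [frozenset({"w:She", "pos:PRP"}),
      frozenset({"w:drove", "pos:VBD"}),
      frozenset({"w:to", "pos:TO"}),
      frozenset({"w:Boston", "pos:NNP", "ent:LOC"})]

best, pattern = local_align(s1, s2)
print(best, [sorted(p) for p in pattern])
# prints: 5 [['pos:VBD'], ['pos:TO', 'w:to'], ['ent:LOC', 'pos:NNP']]
```

In this toy run the pattern mixes levels (a part-of-speech tag, a literal word, an entity class), and the aligned region excludes "She" and "yesterday", which is one way to read the abstract's claim that the alignment, not a predefined window, selects the relevant context. An annotation covering several tokens would have to be projected onto each covered token under this representation, which is a simplification of the variable-length annotations the paper describes.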
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
