Generating Information Extraction Patterns from Overlapping and Variable Length Annotations using Sequence Alignment

Frank Meng; Craig A. Morioka; Danne C. Elbers

arXiv:1908.03594·cs.CL·September 19, 2019

Open Access

TL;DR

This paper uses sequence alignment to generate information extraction patterns from sequences that carry overlapping and variable-length annotations. The alignments also determine the context window around each target, so no fixed window of surrounding words has to be predefined. The system is evaluated on the CoNLL-2003 named entity recognition (NER) task.

Contribution

The method applies sequence alignment to pattern generation for information extraction, accommodating annotations that overlap and vary in length.

Findings

01 Patterns are generated from overlapping and variable-length annotations

02 The context window is determined by the alignment rather than fixed in advance

03 Evaluated on the CoNLL-2003 NER task

Abstract

Sequence alignments are used to capture patterns composed of elements representing multiple conceptual levels through the alignment of sequences that contain overlapping and variable length annotations. The alignments also determine the proper context window of words and phrases that most directly impact the meaning of a given target within a sentence, eliminating the need to predefine a fixed context window of words surrounding the targets. We evaluated the system using the CoNLL-2003 named entity recognition (NER) task.
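
To make the idea in the abstract concrete, here is a minimal Python sketch of aligning two annotated sentences. It is not the authors' implementation: it assumes a standard Smith-Waterman local alignment, assumes each token position is represented as a set of elements (the word plus any annotations covering it, with multi-token annotations copied onto every token they span), and uses illustrative scoring values. The function name `local_align` and the example annotations are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Position = FrozenSet[str]  # the word plus every annotation covering it


def local_align(a: List[Position], b: List[Position],
                match: int = 2, mismatch: int = -1,
                gap: int = -1) -> Tuple[int, List[Position]]:
    """Smith-Waterman local alignment over sequences of element sets.

    Two positions count as a match when they share at least one element.
    Returns the best score and a pattern: the shared elements at each
    aligned position (an empty set marks a gap or a mismatch).
    """
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] & b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    pattern: List[Position] = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] & b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            pattern.append(a[i - 1] & b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            pattern.append(frozenset())  # gap in b
            i -= 1
        else:
            pattern.append(frozenset())  # gap in a
            j -= 1
    pattern.reverse()
    return best, pattern


# Hypothetical annotated sentences: word, part-of-speech tag, entity type.
s1 = [frozenset({"Mr.", "TITLE"}), frozenset({"Smith", "NNP", "PER"}),
      frozenset({"visited", "VBD"}), frozenset({"Paris", "NNP", "LOC"}),
      frozenset({"yesterday", "NN"})]
s2 = [frozenset({"Dr.", "TITLE"}), frozenset({"Jones", "NNP", "PER"}),
      frozenset({"visited", "VBD"}), frozenset({"London", "NNP", "LOC"})]

score, pattern = local_align(s1, s2)
for elements in pattern:
    print(sorted(elements))
```

In this toy example the resulting pattern mixes conceptual levels (a literal word such as "visited" next to abstract labels such as "PER" and "LOC"), and the trailing "yesterday" falls outside the local alignment, which illustrates how an alignment can bound the context window without a preset width.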

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques