# Generating Information Extraction Patterns from Overlapping and Variable Length Annotations using Sequence Alignment

**Authors:** Frank Meng, Craig A. Morioka, Danne C. Elbers

arXiv: 1908.03594 · 2019-09-19

## TL;DR

This paper uses sequence alignment to generate information extraction patterns from sentences that carry overlapping and variable-length annotations. The alignments also determine the context window around each target, so no fixed window size has to be predefined. The system is evaluated on the CoNLL-2003 named entity recognition (NER) task.

## Contribution

The paper introduces a novel use of sequence alignment for pattern generation in information extraction: aligning annotated sequences yields patterns whose elements come from multiple conceptual levels, even when the annotations overlap or vary in length.
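
To make "overlapping and variable-length annotations" concrete, here is a minimal sketch of one possible representation, in which each token carries the set of all labels that cover it. The span format `(start, end, label)`, the label names, and the helper `token_label_sets` are assumptions made for illustration; the summary does not describe the authors' actual data structures.

```python
from typing import List, Set, Tuple

# Hypothetical annotation format: (start_token, end_token_exclusive, label).
Annotation = Tuple[int, int, str]


def token_label_sets(tokens: List[str],
                     annotations: List[Annotation]) -> List[Set[str]]:
    """Return, for each token, its lowercased word plus every label covering it."""
    elements = [{tok.lower()} for tok in tokens]
    for start, end, label in annotations:
        for i in range(start, end):
            elements[i].add(label)
    return elements


tokens = ["John", "Smith", "visited", "New", "York"]
annotations = [
    (0, 2, "PER"),         # two-token span
    (0, 1, "GIVEN_NAME"),  # one-token span overlapping the PER span
    (3, 5, "LOC"),         # another two-token span
]
elements = token_label_sets(tokens, annotations)
```

With this layout, a single sequence position can be matched at the word level or at any of its annotation levels, which is one way an alignment could mix conceptual levels within a pattern.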

## Key findings

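- Alignment produces patterns that combine elements from multiple conceptual levels, even when annotations overlap or vary in length
- The alignment itself determines the context window around a target, so no fixed window has to be predefined
- Evaluated on the CoNLL-2003 named entity recognition (NER) task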

## Abstract

Sequence alignments are used to capture patterns composed of elements representing multiple conceptual levels through the alignment of sequences that contain overlapping and variable length annotations. The alignments also determine the proper context window of words and phrases that most directly impact the meaning of a given target within a sentence, eliminating the need to predefine a fixed context window of words surrounding the targets. We evaluated the system using the CoNLL-2003 named entity recognition (NER) task.
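
The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a standard Smith-Waterman local alignment, treats each sequence position as a set of labels (the word plus any annotations), and counts two positions as a match when their sets intersect. The scoring values and the names `local_align` and `pattern` are hypothetical.

```python
from typing import List, Set, Tuple

Element = Set[str]  # the labels (word and annotations) at one position


def local_align(a: List[Element], b: List[Element], match: int = 2,
                mismatch: int = -1, gap: int = -1) -> List[Tuple[int, int]]:
    """Smith-Waterman local alignment; returns the aligned (i, j) index pairs."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] & b[j - 1] else mismatch
            score[i][j] = max(0, score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best-scoring cell until the score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        s = match if a[i - 1] & b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    pairs.reverse()
    return pairs


def pattern(a: List[Element], b: List[Element]) -> List[Element]:
    """Keep the labels shared at each aligned position."""
    return [a[i] & b[j] for i, j in local_align(a, b) if a[i] & b[j]]


s1 = [{"yesterday"}, {"john", "PER"}, {"visited"}, {"paris", "LOC"}, {"again"}]
s2 = [{"mary", "PER"}, {"visited"}, {"rome", "LOC"}, {"twice"}]
aligned = local_align(s1, s2)
shared = pattern(s1, s2)
```

In this toy case the alignment covers positions 1 to 3 of `s1` and 0 to 2 of `s2`, and the shared pattern is `PER`, `visited`, `LOC`. The unaligned words ("yesterday", "again", "twice") fall outside the aligned region, which shows how a local alignment can bound the context window without a preset width.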

---
Source: https://tomesphere.com/paper/1908.03594