Attending to Characters in Neural Sequence Labeling Models

Marek Rei; Gamal K.O. Crichton; Sampo Pyysalo

arXiv:1611.04361 · cs.CL · November 15, 2016 · 67 cites

TL;DR

This paper introduces an attention-based character-level extension for neural sequence labeling models, improving performance on various datasets by dynamically integrating word and character information.

Contribution

It proposes a novel attention mechanism to effectively combine word and character representations, enhancing sequence labeling accuracy.

Findings

1. Character-level extensions improve performance on every benchmark evaluated.

2. The attention-based architecture delivers the best results while using fewer trainable parameters.

3. Dynamically weighting word- and character-level representations outperforms combining them in a fixed way.

Abstract

Sequence labeling architectures use word embeddings for capturing similarity, but suffer when handling previously unseen or rare words. We investigate character-level extensions to such models and propose a novel architecture for combining alternative word representations. By using an attention mechanism, the model is able to dynamically decide how much information to use from a word- or character-level component. We evaluated different architectures on a range of sequence labeling datasets, and character-level extensions were found to improve performance on every benchmark. In addition, the proposed attention-based architecture delivered the best results even with a smaller number of trainable parameters.
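
The abstract does not spell out how the attention mechanism weighs the two representations. One plausible reading, sketched below in plain Python, is an element-wise gate: a small network looks at both the word embedding and the character-derived vector, produces a weight between 0 and 1 for each dimension, and returns the weighted mix. The function names, the three weight matrices, the vector size, and the random initialization are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def attention_combine(word_vec, char_vec, w_word, w_char, w_out):
    """Blend a word embedding with a character-derived vector of equal size.

    Assumed form (hypothetical, for illustration only):
        gate     = sigmoid(w_out @ tanh(w_word @ word_vec + w_char @ char_vec))
        combined = gate * word_vec + (1 - gate) * char_vec   (element-wise)
    """
    hidden = [
        math.tanh(a + b)
        for a, b in zip(matvec(w_word, word_vec), matvec(w_char, char_vec))
    ]
    gate = [sigmoid(v) for v in matvec(w_out, hidden)]
    combined = [
        g * w + (1.0 - g) * c for g, w, c in zip(gate, word_vec, char_vec)
    ]
    return combined, gate


def random_matrix(rows, cols, rng):
    # Untrained placeholder weights; a real model would learn these.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4  # toy vector size, chosen only for the example
word_vec = [rng.uniform(-1, 1) for _ in range(dim)]
char_vec = [rng.uniform(-1, 1) for _ in range(dim)]
w_word, w_char, w_out = (random_matrix(dim, dim, rng) for _ in range(3))

combined, gate = attention_combine(word_vec, char_vec, w_word, w_char, w_out)
print("gate:    ", gate)
print("combined:", combined)
```

A gate value near 1 in some dimension means the output leans on the word embedding there, while a value near 0 means it leans on the character-level vector. For a rare or unseen word, a trained gate of this kind could shift weight toward the character side, which matches the motivation given in the abstract.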

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Handwritten Text Recognition Techniques