Attending to Characters in Neural Sequence Labeling Models
Marek Rei, Gamal K.O. Crichton, Sampo Pyysalo

TL;DR
This paper introduces an attention-based character-level extension for neural sequence labeling models, improving performance on various datasets by dynamically integrating word and character information.
Contribution
It proposes a novel attention mechanism to effectively combine word and character representations, enhancing sequence labeling accuracy.
Findings
Character-level extensions improve performance across all benchmarks.
The attention-based architecture achieves the best results while using fewer trainable parameters.
Dynamically weighting word- and character-level representations with attention outperforms combining them in a fixed way.
Abstract
Sequence labeling architectures use word embeddings for capturing similarity, but suffer when handling previously unseen or rare words. We investigate character-level extensions to such models and propose a novel architecture for combining alternative word representations. By using an attention mechanism, the model is able to dynamically decide how much information to use from a word- or character-level component. We evaluated different architectures on a range of sequence labeling datasets, and character-level extensions were found to improve performance on every benchmark. In addition, the proposed attention-based architecture delivered the best results even with a smaller number of trainable parameters.
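The abstract does not spell out the equations, so the following is only a minimal sketch of one way such a dynamic combination could work. It assumes an elementwise sigmoid gate computed from both representations. The function `attention_combine` and the weight names `W_word`, `W_char`, and `W_gate` are hypothetical and not taken from the paper. It also assumes the word-level and character-level vectors have the same dimensionality.

```python
import math


def sigmoid(v):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def attention_combine(word_vec, char_vec, W_word, W_char, W_gate):
    """Blend a word embedding with a character-based representation.

    Assumed gating form (illustrative, not confirmed by the abstract):
        hidden   = tanh(W_word @ word_vec + W_char @ char_vec)
        z        = sigmoid(W_gate @ hidden)
        combined = z * word_vec + (1 - z) * char_vec
    Each element of z decides how much of the word-level value to keep
    versus the character-level value in that dimension.
    """
    hidden = [
        math.tanh(a + b)
        for a, b in zip(matvec(W_word, word_vec), matvec(W_char, char_vec))
    ]
    z = [sigmoid(s) for s in matvec(W_gate, hidden)]
    combined = [zi * x + (1.0 - zi) * m for zi, x, m in zip(z, word_vec, char_vec)]
    return combined, z


if __name__ == "__main__":
    # Toy 2-dimensional example with all-zero weights: the gate is then
    # exactly 0.5, so the output is the plain average of the two inputs.
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    combined, gate = attention_combine([1.0, 3.0], [3.0, 1.0], zeros, zeros, zeros)
    print(gate, combined)
```

In a trained model the weights would be learned jointly with the rest of the tagger. For a rare or unseen word, the gate could then shift toward the character-level vector, which is the behaviour the abstract describes as dynamically deciding how much information to use from each component.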
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Handwritten Text Recognition Techniques
