A Global Context Mechanism for Sequence Labeling
Conglei Xu, Kun Shen, Hongguang Sun, Yang Xu

TL;DR
This paper introduces a simple, efficient global context mechanism that enhances sequence labeling models like BiLSTM and transformers, improving accuracy with minimal speed impact across multiple benchmarks.
Contribution
The proposed mechanism effectively supplements global sentence information into existing models, addressing speed and compatibility issues of previous methods.
Findings
Significant F1 score improvements on seven benchmarks.
Achieves the third-highest score on the Weibo NER benchmark.
Maintains high inference and training speed compared to CRF.
Abstract
Global sentence information is crucial for sequence labeling tasks, where each word in a sentence must be assigned a label. While BiLSTM models are widely used, they often fail to capture sufficient global context for inner words. Previous work has proposed various RNN variants to integrate global sentence information into word representations. However, these approaches suffer from three key limitations: (1) they are slower in both inference and training than the original BiLSTM, (2) they cannot effectively supplement global information for transformer-based models, and (3) they carry a high time cost for reimplementing and integrating these customized RNNs into existing architectures. In this study, we introduce a simple yet effective mechanism that addresses these limitations. Our approach efficiently supplements global sentence information for both BiLSTM and…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
Methods: Multi-Head Attention · Attention Is All You Need · Tanh Activation · Sigmoid Activation · Linear Layer · Long Short-Term Memory · Layer Normalization · Softmax · Bidirectional LSTM
