DyLex: Incorporating Dynamic Lexicons into BERT for Sequence Labeling
Baojun Wang, Zhao Zhang, Kun Xu, Guang-Yuan Hao, Yuyang Zhang, Lifeng Shang, Linlin Li, Xiao Chen, Xin Jiang, Qun Liu

TL;DR
DyLex is a plug-in approach for incorporating dynamic lexicons into BERT for sequence labeling. It handles large-scale, frequently updated lexicons, reduces matching noise, and reports state-of-the-art results on ten datasets across three tasks.
Contribution
The paper presents DyLex, a plug-in framework for BERT-based sequence labeling. It uses word-agnostic tag embeddings, so the lexicon can be updated without re-training the representation, and a supervised denoising method to smooth out matching noise.
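The idea behind word-agnostic tag embeddings can be illustrated with a small sketch. This is not the authors' implementation: the tag set, the `match_lexicon` helper, and the toy lexicon are assumptions made for illustration. The point it shows is that each lexicon match is encoded as a row of tag ids rather than word ids, so adding lexicon entries changes which rows are produced but never enlarges the tag vocabulary that an embedding table would be indexed by.

```python
from typing import Dict, List, Tuple

# Hypothetical fixed tag vocabulary (BIO tags per lexicon type).
# An embedding table indexed by these ids would have len(TAGS) rows,
# regardless of how many words the lexicon contains.
TAGS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]
TAG_TO_ID = {tag: i for i, tag in enumerate(TAGS)}

Lexicon = Dict[Tuple[str, ...], List[str]]


def match_lexicon(tokens: List[str], lexicon: Lexicon) -> List[List[int]]:
    """Return one row of tag ids per (span, type) lexicon match in `tokens`."""
    rows: List[List[int]] = []
    max_len = max((len(entry) for entry in lexicon), default=0)
    for start in range(len(tokens)):
        for length in range(1, min(max_len, len(tokens) - start) + 1):
            span = tuple(tokens[start:start + length])
            for entity_type in lexicon.get(span, []):
                # Every position outside the matched span is tagged "O".
                row = [TAG_TO_ID["O"]] * len(tokens)
                row[start] = TAG_TO_ID[f"B-{entity_type}"]
                for i in range(start + 1, start + length):
                    row[i] = TAG_TO_ID[f"I-{entity_type}"]
                rows.append(row)
    return rows


tokens = ["Jordan", "visited", "New", "York"]
# "Jordan" is ambiguous here: one of its two matches is noise that a
# denoising step would have to filter out.
lexicon: Lexicon = {("Jordan",): ["PER", "LOC"], ("New", "York"): ["LOC"]}
rows = match_lexicon(tokens, lexicon)

# Updating the lexicon yields more rows but no new tag ids.
lexicon[("York",)] = ["LOC"]
rows_after_update = match_lexicon(tokens, lexicon)
max_tag_id = max(tag_id for row in rows_after_update for tag_id in row)
```

In this sketch, a model would consume the tag-id rows through a small embedding table of size `len(TAGS)`, which is what makes the representation independent of the individual words in the lexicon.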
Findings
Achieves new state-of-the-art performance on ten datasets
Effectively handles large-scale lexicons with reduced noise
Demonstrates robustness across three sequence labeling tasks
Abstract
Incorporating lexical knowledge into deep learning models has been proved to be very effective for sequence labeling tasks. However, previous works commonly have difficulty dealing with large-scale dynamic lexicons which often cause excessive matching noise and problems of frequent updates. In this paper, we propose DyLex, a plug-in lexicon incorporation approach for BERT based sequence labeling tasks. Instead of leveraging embeddings of words in the lexicon as in conventional methods, we adopt word-agnostic tag embeddings to avoid re-training the representation while updating the lexicon. Moreover, we employ an effective supervised lexical knowledge denoising method to smooth out matching noise. Finally, we introduce a col-wise attention based knowledge fusion mechanism to guarantee the pluggability of the proposed framework. Experiments on ten datasets of three tasks show that the…
Taxonomy
Topics: Natural Language Processing Techniques · Handwritten Text Recognition Techniques · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · WordPiece · Layer Normalization · Dense Connections · Attention Dropout · Linear Warmup With Linear Decay
