BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition

Yinghao Li; Pranav Shetty; Lucas Liu; Chao Zhang; Le Song

arXiv:2105.12848 · cs.CL · June 1, 2021 · 1 citation

TL;DR

This paper introduces the conditional hidden Markov model (CHMM), which combines a hidden Markov model with BERT embeddings to infer true labels from noisy multi-source labels for weakly supervised NER.

Contribution

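The paper proposes a conditional hidden Markov model (CHMM) that predicts token-wise transition and emission probabilities from BERT embeddings, together with an alternate-training approach (CHMM-ALT) that further refines it for weakly supervised NER.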

Findings

01 Outperforms state-of-the-art weakly supervised NER models

02 Effective in handling noisy multi-source labels

03 Achieves significant improvements on multiple benchmarks

Abstract

We study the problem of learning a named entity recognition (NER) tagger using noisy labels from multiple weak supervision sources. Though cheap to obtain, the labels from weak supervision sources are often incomplete, inaccurate, and contradictory, making it difficult to learn an accurate NER model. To address this challenge, we propose a conditional hidden Markov model (CHMM), which can effectively infer true labels from multi-source noisy labels in an unsupervised way. CHMM enhances the classic hidden Markov model with the contextual representation power of pre-trained language models. Specifically, CHMM learns token-wise transition and emission probabilities from the BERT embeddings of the input tokens to infer the latent true labels from noisy observations. We further refine CHMM with an alternate-training approach (CHMM-ALT). It fine-tunes a BERT-NER model with the labels inferred…
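The core idea in the abstract, an HMM whose transition and emission probabilities vary per token and are derived from that token's embedding, can be illustrated with a small sketch. This is not the authors' implementation. It assumes hard label indices from each weak source, conditionally independent sources, and a single hypothetical linear layer followed by a row-wise softmax in place of whatever network the paper uses. The names `embedding_to_matrix` and `posterior_labels` are made up for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def embedding_to_matrix(emb, weights, n_rows, n_cols):
    """Map one token embedding to a row-stochastic n_rows x n_cols matrix.

    Assumption: a single linear layer (weights has n_rows * n_cols rows,
    each as long as emb) followed by a softmax over every matrix row.
    """
    logits = [sum(w * x for w, x in zip(row, emb)) for row in weights]
    return [softmax(logits[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]


def posterior_labels(prior, transitions, emissions, observations):
    """Forward-backward pass with token-wise parameters.

    prior[k]              : probability of latent label k at the first token
    transitions[t][j][k]  : P(label k at token t | label j at token t-1);
                            transitions[0] is unused
    emissions[t][s][k][o] : P(source s emits label o | latent label k) at token t
    observations[t][s]    : label index reported by source s at token t
    Returns gamma[t][k], the posterior of latent label k at token t.
    """
    n_tokens, n_labels = len(observations), len(prior)

    # Joint likelihood of all sources' labels at each token, per latent label.
    lik = [
        [math.prod(emissions[t][s][k][o] for s, o in enumerate(observations[t]))
         for k in range(n_labels)]
        for t in range(n_tokens)
    ]

    # Forward pass, normalized at every step to avoid underflow.
    alpha = []
    for t in range(n_tokens):
        if t == 0:
            a = [prior[k] * lik[0][k] for k in range(n_labels)]
        else:
            a = [lik[t][k] * sum(alpha[t - 1][j] * transitions[t][j][k]
                                 for j in range(n_labels))
                 for k in range(n_labels)]
        z = sum(a)
        alpha.append([v / z for v in a])

    # Backward pass, also normalized.
    beta = [[1.0] * n_labels for _ in range(n_tokens)]
    for t in range(n_tokens - 2, -1, -1):
        b = [sum(transitions[t + 1][k][j] * lik[t + 1][j] * beta[t + 1][j]
                 for j in range(n_labels))
             for k in range(n_labels)]
        z = sum(b)
        beta[t] = [v / z for v in b]

    # Combine into per-token posteriors.
    gamma = []
    for t in range(n_tokens):
        g = [alpha[t][k] * beta[t][k] for k in range(n_labels)]
        z = sum(g)
        gamma.append([v / z for v in g])
    return gamma
```

In use, each token's embedding would be passed through `embedding_to_matrix` once for the transition matrix and once per source for the emission matrices, and the argmax of each row of `gamma` gives the inferred label for that token. Training the weights (for example by maximizing the likelihood of the observed weak labels) is left out of this sketch.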

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Layer Normalization · Softmax · Attention Dropout · WordPiece · Dropout