TL;DR
This paper introduces Sparse-CHMM, a weakly supervised NER model that estimates the reliability of each labeling function (LF) to improve entity recognition accuracy without manual annotations.
Contribution
The paper proposes Sparse-CHMM, which estimates only the diagonal elements of the emission matrix as LF reliability scores, expands them to the full matrix with pre-defined expansion functions, and augments the emission with weighted XOR scores for weakly supervised NER.
Findings
Achieves a 3.01-point F1 improvement over baselines
LF reliabilities strongly correlate with true F1 scores
Each component of Sparse-CHMM is effective
Abstract
Weakly supervised named entity recognition methods train label models to aggregate the token annotations of multiple noisy labeling functions (LFs) without seeing any manually annotated labels. To work well, the label model needs to contextually identify and emphasize well-performing LFs while down-weighting the under-performers. However, evaluating the LFs is challenging due to the lack of ground truth. To address this issue, we propose the sparse conditional hidden Markov model (Sparse-CHMM). Instead of predicting the entire emission matrix as other HMM-based methods do, Sparse-CHMM focuses on estimating its diagonal elements, which are treated as the reliability scores of the LFs. The sparse scores are then expanded to the full-fledged emission matrix with pre-defined expansion functions. We also augment the emission with weighted XOR scores, which track the probabilities of an LF…
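To make the diagonal-to-full-matrix step concrete, here is a minimal sketch of one possible expansion. It is not the paper's expansion function, which the excerpt above does not specify. It assumes, purely for illustration, that whatever probability mass is left after the diagonal reliability score is spread uniformly over the off-diagonal entries of the same row. The function name `expand_diagonal` and the example scores are hypothetical.

```python
def expand_diagonal(reliabilities):
    """Expand per-label reliability scores into a row-stochastic emission matrix.

    Assumption (not taken from the paper): the leftover mass 1 - r_i in row i
    is divided evenly among the off-diagonal entries of that row.
    """
    n = len(reliabilities)
    if n < 2:
        raise ValueError("need at least two labels")
    matrix = []
    for i, r in enumerate(reliabilities):
        if not 0.0 <= r <= 1.0:
            raise ValueError("each reliability score must lie in [0, 1]")
        off_diagonal = (1.0 - r) / (n - 1)
        matrix.append([r if j == i else off_diagonal for j in range(n)])
    return matrix


# Hypothetical reliability scores for one LF over three labels.
emission = expand_diagonal([0.9, 0.6, 0.75])
for row in emission:
    print([round(p, 3) for p in row])
```

Each row is a probability distribution over the labels the LF may emit given the true label, so a higher diagonal score means the LF is trusted more for that label.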
