RoPDA: Robust Prompt-based Data Augmentation for Low-Resource Named Entity Recognition
Sihan Song, Furao Shen, Jian Zhao

TL;DR
RoPDA is a robust prompt-based data augmentation method for low-resource NER. It improves performance by generating high-quality augmented data and by making effective use of unlabeled data.
Contribution
The paper proposes a novel prompt-based augmentation framework that combines Self-Consistency Filtering and mixup to improve low-resource NER performance.
Findings
Significant performance improvements over strong baselines.
Outperforms state-of-the-art semi-supervised methods when unlabeled data is used.
Effective augmentation operations that preserve label integrity.
Abstract
Data augmentation has been widely used in low-resource NER tasks to tackle the problem of data sparsity. However, previous data augmentation methods have the disadvantages of disrupted syntactic structures, token-label mismatch, and requirement for external knowledge or manual effort. To address these issues, we propose Robust Prompt-based Data Augmentation (RoPDA) for low-resource NER. Based on pre-trained language models (PLMs) with continuous prompt, RoPDA performs entity augmentation and context augmentation through five fundamental augmentation operations to generate label-flipping and label-preserving examples. To optimize the utilization of the augmented samples, we present two techniques: Self-Consistency Filtering and mixup. The former effectively eliminates low-quality samples, while the latter prevents performance degradation arising from the direct utilization of…
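The abstract names mixup as the technique that keeps augmented samples from degrading performance, but the truncated text does not say how RoPDA applies it. As a rough illustration only, the sketch below shows generic mixup: two examples and their label distributions are blended with a weight drawn from a Beta distribution. The function name `mixup`, the `alpha` value, the toy 2-dimensional features, and the three-class label layout are all assumptions made for this example and are not taken from the paper.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.5, rng=None):
    """Generic mixup: convex combination of two examples and their labels.

    x_a, x_b: feature vectors (lists of floats) of equal length.
    y_a, y_b: label distributions (e.g. one-hot lists) of equal length.
    alpha:    Beta(alpha, alpha) shape parameter (assumed value, not from the paper).
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Toy usage: blend an original example with an augmented one.
# Hypothetical label order: [O, PER, LOC].
original_x, original_y = [1.0, 0.0], [0.0, 1.0, 0.0]
augmented_x, augmented_y = [0.0, 1.0], [0.0, 0.0, 1.0]
mixed_x, mixed_y, lam = mixup(
    original_x, original_y, augmented_x, augmented_y, rng=random.Random(0)
)
```

Because the mixed label is a soft distribution rather than a hard class, a model trained on it is penalized less sharply when an augmented sample is noisy, which is the usual motivation for interpolating instead of training on augmented data directly.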
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
