SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup
Rongzhi Zhang, Yue Yu, Chao Zhang

TL;DR
SeqMix is a data augmentation technique that uses sequence mixup to improve the label efficiency of active sequence labeling. By generating plausible labeled sequences, it raises F1 scores on Named Entity Recognition (NER) and Event Detection tasks.
Contribution
The paper presents SeqMix, a novel sequence mixup method with a discriminator to generate plausible labeled sequences, boosting active sequence labeling performance.
Findings
SeqMix improves F1 scores by 2.27%–3.75% on Named Entity Recognition and Event Detection tasks.
Sequence and token-level label mixup enhances label efficiency.
A discriminator judges whether the generated sequences are plausible.
Abstract
Active learning is an important technique for low-resource sequence labeling tasks. However, current active sequence labeling methods use the queried samples alone in each iteration, which is an inefficient way of leveraging human annotations. We propose a simple but effective data augmentation method to improve the label efficiency of active sequence labeling. Our method, SeqMix, simply augments the queried samples by generating extra labeled sequences in each iteration. The key difficulty is to generate plausible sequences along with token-level labels. In SeqMix, we address this challenge by performing mixup for both sequences and token-level labels of the queried samples. Furthermore, we design a discriminator during sequence mixup, which judges whether the generated sequences are plausible or not. Our experiments on Named Entity Recognition and Event Detection tasks show that…
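The mixup step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two equal-length sequences, a toy embedding table (`vocab`), mixing of embeddings and label distributions by a convex combination with coefficient `lam`, and a mapping of each mixed embedding back to its nearest vocabulary token. The nearest-token step, the `is_plausible` callback standing in for the discriminator, and all function names are hypothetical choices made for illustration.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Vector = List[float]


def sample_lambda(alpha: float, rng: random.Random) -> float:
    """Draw a mixing coefficient from Beta(alpha, alpha), as in standard mixup."""
    return rng.betavariate(alpha, alpha)


def mix(a: Sequence[float], b: Sequence[float], lam: float) -> Vector:
    """Convex combination lam * a + (1 - lam) * b of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def nearest_token(vec: Sequence[float], vocab: Dict[str, Vector]) -> str:
    """Assumed decoding step: pick the token whose embedding is closest (squared L2)."""
    return min(vocab, key=lambda tok: sum((v - e) ** 2 for v, e in zip(vec, vocab[tok])))


def seq_mixup(
    tokens_a: Sequence[str],
    labels_a: Sequence[Sequence[float]],
    tokens_b: Sequence[str],
    labels_b: Sequence[Sequence[float]],
    vocab: Dict[str, Vector],
    lam: float,
    is_plausible: Optional[Callable[[List[str]], bool]] = None,
) -> Optional[Tuple[List[str], List[Vector]]]:
    """Mix two labeled sequences position by position.

    Returns (mixed_tokens, mixed_soft_labels), or None when the hypothetical
    discriminator callback rejects the generated token sequence.
    """
    if not (len(tokens_a) == len(labels_a) == len(tokens_b) == len(labels_b)):
        raise ValueError("both sequences and their label lists must share one length")

    mixed_tokens: List[str] = []
    mixed_labels: List[Vector] = []
    for tok_a, lab_a, tok_b, lab_b in zip(tokens_a, labels_a, tokens_b, labels_b):
        # Mix the token embeddings, then map the result back to a real token.
        mixed_tokens.append(nearest_token(mix(vocab[tok_a], vocab[tok_b], lam), vocab))
        # Mix the token-level label distributions with the same coefficient.
        mixed_labels.append(mix(lab_a, lab_b, lam))

    # Discriminator gate: discard sequences judged implausible.
    if is_plausible is not None and not is_plausible(mixed_tokens):
        return None
    return mixed_tokens, mixed_labels
```

In an active learning loop, a function like this would be applied to pairs of newly queried samples in each iteration, with `lam` drawn via `sample_lambda` and the accepted outputs added to the training set alongside the human-annotated sequences. Because the mixed labels are soft distributions rather than one-hot tags, the downstream tagger would need a loss that accepts soft targets.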
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Algorithms
Methods: Mixup
