A Diversity-Enhanced and Constraints-Relaxed Augmentation for Low-Resource Classification

Guang Liu; Hailong Huang; Yuzhao Mao; Weiguo Gao; Xuan Li; Jianping Shen

arXiv:2109.11834·cs.CL·September 27, 2021

Open Access

TL;DR

This paper introduces DECRA, a novel data augmentation method for low-resource classification that enhances diversity and relaxes constraints, leading to improved classifier generalization and state-of-the-art performance.

Contribution

DECRA combines a k-beta augmentation for diversity and a masked language model loss for relaxed constraints, advancing data augmentation techniques in low-resource settings.

Findings

01. DECRA outperforms existing methods by 3.8% overall.

02. Enhanced diversity improves classifier generalization.

03. Relaxed constraints enable more effective training data generation.

Abstract

Data augmentation (DA) aims to generate constrained and diversified data to improve classifiers in Low-Resource Classification (LRC). Previous studies mostly use a fine-tuned Language Model (LM) to strengthen the constraints but ignore the fact that the potential of diversity could improve the effectiveness of generated data. In LRC, strong constraints but weak diversity in DA result in the poor generalization ability of classifiers. To address this dilemma, we propose a Diversity-Enhanced and Constraints-Relaxed Augmentation (DECRA). Our DECRA has two essential components on top of a transformer-based backbone model. 1) A k-beta augmentation, an essential component of DECRA, is proposed to enhance the diversity in generating constrained data. It expands the changing scope and improves the degree of complexity of the generated data. 2) A masked language model loss, instead of…
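The abstract is cut off before it says how DECRA applies its masked language model loss, so the following is only a minimal sketch of a generic masked language model loss: mean cross-entropy computed over masked positions only. It is not the paper's formulation. The function name `masked_lm_loss` and its list-based inputs are assumptions made for illustration.

```python
import math

def masked_lm_loss(logits, targets, mask):
    """Mean cross-entropy over masked positions only (generic MLM loss).

    Hypothetical helper, not taken from the paper.
    logits:  one list of vocabulary scores per token position
    targets: the original token id at each position
    mask:    True where the input token was masked out
    """
    total, count = 0.0, 0
    for scores, gold, is_masked in zip(logits, targets, mask):
        if not is_masked:
            continue  # unmasked positions contribute nothing to the loss
        # Numerically stable log-sum-exp for the softmax normalizer.
        peak = max(scores)
        log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_z - scores[gold]  # negative log-probability of the gold token
        count += 1
    return total / count if count else 0.0

# Usage: two positions over a 4-token vocabulary, only the first is masked.
example = masked_lm_loss(
    logits=[[0.0, 0.0, 0.0, 0.0], [5.0, 0.0, 0.0, 0.0]],
    targets=[2, 0],
    mask=[True, False],
)
```

With uniform scores at the only masked position, the loss equals the log of the vocabulary size, which is a quick sanity check for any implementation of this kind.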

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications