Back-Translated Task Adaptive Pretraining: Improving Accuracy and Robustness on Text Classification

Junghoon Lee; Jounghee Kim; Pilsung Kang

arXiv:2107.10474 · cs.CL · July 23, 2021 · 6 cites

Open Access

TL;DR

This paper introduces BT-TAPT, a task-adaptive pretraining method that augments the task data with back-translation before re-pretraining the language model, improving accuracy and robustness to noise in text classification.

Contribution

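The paper proposes BT-TAPT, a back-translation-based data augmentation method that increases the amount of task-specific data available for adaptive pretraining, addressing the underfitting on the task distribution that occurs when the language model is re-pretrained on a small task corpus.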

Findings

01. Improves classification accuracy on both low-resource and high-resource datasets.

02. Enhances robustness of language models to noise.

03. Outperforms conventional adaptive pretraining methods.

Abstract

Language models (LMs) pretrained on a large text corpus and fine-tuned on a downstream task have become a de facto training strategy for several natural language processing (NLP) tasks. Recently, adaptive pretraining, which re-pretrains the pretrained language model on task-relevant data, has shown significant performance improvements. However, current adaptive pretraining methods underfit the task distribution because only a relatively small amount of data is available to re-pretrain the LM. To fully exploit adaptive pretraining, we propose a back-translated task-adaptive pretraining (BT-TAPT) method that increases the amount of task-specific data for LM re-pretraining by augmenting the task data with back-translation, so that the LM generalizes to the target task domain. The experimental results show that the proposed BT-TAPT yields…
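The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Translator` type, the function names, and the toy dictionary "translators" are all hypothetical stand-ins, and the pivot language is an assumption. In practice the two callables would wrap a real machine translation system, and the resulting corpus would then be used to re-pretrain the LM.

```python
from typing import Callable, Iterable, List

# Hypothetical type: any function that maps a sentence to its translation.
Translator = Callable[[str], str]


def back_translate(text: str, to_pivot: Translator, to_source: Translator) -> str:
    """Translate into a pivot language and back to obtain a paraphrase."""
    return to_source(to_pivot(text))


def build_augmented_corpus(
    task_texts: Iterable[str],
    to_pivot: Translator,
    to_source: Translator,
    n_aug: int = 1,
) -> List[str]:
    """Return the original task texts followed by their new back-translations.

    Paraphrases that duplicate a text already in the corpus are skipped,
    so a deterministic translator adds at most one paraphrase per text.
    """
    originals = list(task_texts)
    corpus = list(originals)
    seen = set(corpus)
    for text in originals:
        for _ in range(n_aug):
            paraphrase = back_translate(text, to_pivot, to_source)
            if paraphrase not in seen:
                seen.add(paraphrase)
                corpus.append(paraphrase)
    return corpus


# Toy stand-ins for real translation models (assumption: English/German pivot).
_TOY_EN_DE = {"The movie was great.": "Der Film war toll."}
_TOY_DE_EN = {"Der Film war toll.": "The film was great."}


def toy_en_de(sentence: str) -> str:
    # Unknown sentences pass through unchanged.
    return _TOY_EN_DE.get(sentence, sentence)


def toy_de_en(sentence: str) -> str:
    return _TOY_DE_EN.get(sentence, sentence)


if __name__ == "__main__":
    task_data = ["The movie was great.", "Terrible plot."]
    for line in build_augmented_corpus(task_data, toy_en_de, toy_de_en):
        print(line)
```

With a translator that samples its output, `n_aug` greater than 1 could yield several distinct paraphrases per sentence; with the deterministic toy translators above, repeats are simply deduplicated.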

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications