Bag of Lies: Robustness in Continuous Pre-training BERT

Ine Gevers; Walter Daelemans

arXiv:2406.09967·cs.CL·June 17, 2024

TL;DR

This paper investigates the robustness of continuous pre-training of BERT, especially in the context of new entity knowledge like COVID-19, revealing surprising resilience against misinformation and introducing a new dataset.

Contribution

It provides insights into how continuous pre-training affects BERT's entity knowledge and robustness, and introduces a new dataset with AI-generated false texts.

Findings

01

Continuous pre-training on adversarially manipulated input (misinformation, shuffled word order) does not degrade downstream performance.

02

Continuous pre-training, even on manipulated data, sometimes improves downstream performance on the Check-COVID fact-checking benchmark.

03

The model shows robustness against misinformation during continuous pre-training.

Abstract

This study aims to acquire more insights into the continuous pre-training phase of BERT regarding entity knowledge, using the COVID-19 pandemic as a case study. Since the pandemic emerged after the last update of BERT's pre-training data, the model has little to no entity knowledge about COVID-19. Using continuous pre-training, we control what entity knowledge is available to the model. We compare the baseline BERT model with the further pre-trained variants on the fact-checking benchmark Check-COVID. To test the robustness of continuous pre-training, we experiment with several adversarial methods to manipulate the input data, such as training on misinformation and shuffling the word order until the input becomes nonsensical. Surprisingly, our findings reveal that these methods do not degrade, and sometimes even improve, the model's downstream performance. This suggests that continuous…
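
One of the manipulations mentioned in the abstract, shuffling the word order until the input becomes nonsensical, can be illustrated with a minimal sketch. The function names `shuffle_words` and `shuffle_corpus`, the whitespace tokenization, and the fixed seed are assumptions made for illustration; the summary above does not specify how the authors implemented the shuffling.

```python
import random
from typing import List, Optional


def shuffle_words(text: str, rng: Optional[random.Random] = None) -> str:
    """Return `text` with its whitespace-separated words in random order.

    Hypothetical helper: splitting on whitespace is an assumption, not
    necessarily the tokenization used in the paper.
    """
    rng = rng or random.Random()
    words = text.split()
    rng.shuffle(words)  # in-place shuffle of the word list
    return " ".join(words)


def shuffle_corpus(sentences: List[str], seed: int = 0) -> List[str]:
    """Shuffle the word order of every sentence, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [shuffle_words(sentence, rng) for sentence in sentences]


if __name__ == "__main__":
    corpus = [
        "COVID-19 is caused by the SARS-CoV-2 virus",
        "Vaccines reduce the risk of severe illness",
    ]
    for original, shuffled in zip(corpus, shuffle_corpus(corpus, seed=42)):
        print(original, "->", shuffled)
```

Each shuffled sentence keeps exactly the same words as the original, so the vocabulary seen during continued pre-training is unchanged and only the syntactic structure is destroyed.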

Taxonomy

Topics: Software Reliability and Analysis Research

Methods: Attention Is All You Need · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dropout · Adam · Linear Layer · Dense Connections · Multi-Head Attention