On a Benefit of Mask Language Modeling: Robustness to Simplicity Bias
Ting-Rui Chiang

TL;DR
This paper demonstrates that masked language model pretraining enhances robustness against simplicity bias by reducing reliance on spurious features, supported by theoretical analysis and empirical experiments in NLP tasks.
Contribution
It provides a theoretical explanation for MLM's robustness to spurious features and validates it through experiments on hate speech detection and NER tasks.
Findings
MLM pretraining reduces reliance on lexicon-level spurious features.
Theoretically, the modeled distribution of a spurious feature conditioned on its context is at least as informative as the feature itself, and at least as simple to learn from.
Experiments confirm improved robustness in hate speech detection and NER.
Abstract
Despite the success of pretrained masked language models (MLM), why MLM pretraining is useful is still a question not fully answered. In this work we theoretically and empirically show that MLM pretraining makes models robust to lexicon-level spurious features, partly answering the question. We theoretically show that, when we can model the distribution of a spurious feature conditioned on the context, then (1) this distribution is at least as informative as the spurious feature, and (2) learning from this distribution is at least as simple as learning from the spurious feature. Therefore, MLM pretraining rescues the model from the simplicity bias caused by the spurious feature. We also explore the efficacy of MLM pretraining in causal settings. Finally, we close the gap between our theories and real-world practices by conducting experiments on hate speech detection and named entity recognition…
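To make "the distribution of a spurious feature conditioned on the context" concrete, here is a minimal sketch that estimates such a distribution by simple counting. It is not the paper's method: a real MLM approximates this distribution with a neural network. The function name `fit_conditional` and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def fit_conditional(pairs):
    """Estimate P(token | context) from (context, token) pairs by counting.

    Hypothetical stand-in for what an MLM head approximates when it
    predicts a masked token from its surrounding context.
    """
    counts = defaultdict(Counter)
    for context, token in pairs:
        counts[context][token] += 1
    return {
        context: {tok: n / sum(c.values()) for tok, n in c.items()}
        for context, c in counts.items()
    }


# Toy data (assumed): each pair is (context with one token masked, masked token).
pairs = [
    ("the movie was", "great"),
    ("the movie was", "great"),
    ("the movie was", "awful"),
    ("the food was", "awful"),
]

model = fit_conditional(pairs)

# Instead of the single observed token, a downstream model could be fed
# this whole distribution over tokens for the given context.
print(model["the movie was"])
```

The point of the sketch is only the change of input: the observed token is one draw, while the estimated distribution summarizes every token that the context makes plausible.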
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Natural Language Processing Techniques
