Invariant Language Modeling

Maxime Peyrard; Sarvjeet Singh Ghotra; Martin Josifoski; Vidhan Agarwal; Barun Patra; Dean Carignan; Emre Kiciman; Robert West

arXiv:2110.08413 · cs.CL · November 16, 2022 · 1 citation

TL;DR

This paper introduces invariant language modeling, a new framework inspired by causal learning principles, to improve out-of-domain generalization and reduce biases in large pretrained language models through invariant representations.

Contribution

It adapts a game-theoretic IRM approach to language models, enabling invariant representations that enhance robustness and mitigate spurious correlations.

Findings

01 Removes structured noise effectively

02 Ignores specific spurious correlations without harming performance

03 Achieves better out-of-domain generalization

Abstract

Large pretrained language models are critical components of modern NLP pipelines. Yet, they suffer from spurious correlations, poor out-of-domain generalization, and biases. Inspired by recent progress in causal machine learning, in particular the invariant risk minimization (IRM) paradigm, we propose invariant language modeling, a framework for learning invariant representations that generalize better across multiple environments. In particular, we adapt a game-theoretic formulation of IRM (IRM-games) to language models, where the invariance emerges from a specific training schedule in which all the environments compete to optimize their own environment-specific loss by updating subsets of the model in a round-robin fashion. We focus on controlled experiments to precisely demonstrate the ability of our method to (i) remove structured noise, (ii) ignore specific spurious correlations…
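
The round-robin schedule described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation and not a language model: it assumes a linear regression task, one linear head per environment, an ensemble prediction equal to the mean of the heads, and that each environment updates only its own head. All function names, the toy data generator, and the hyperparameters are hypothetical and chosen for illustration only.

```python
import random


def ensemble_predict(heads, x):
    """Average the predictions of all per-environment linear heads for input x."""
    preds = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in heads]
    return sum(preds) / len(preds)


def env_loss(heads, batch):
    """Mean squared error of the ensemble on one environment's batch."""
    return sum((ensemble_predict(heads, x) - y) ** 2 for x, y in batch) / len(batch)


def round_robin_step(heads, env_id, batch, lr=0.05):
    """One gradient step on head `env_id` only, using that environment's own loss.

    Assumption: the "subset of the model" owned by an environment is its head.
    """
    n_heads = len(heads)
    grad = [0.0] * len(heads[env_id])
    for x, y in batch:
        err = ensemble_predict(heads, x) - y
        for i, x_i in enumerate(x):
            # d/dw_e[i] of (mean_k(w_k . x) - y)^2 = 2 * err * x_i / n_heads
            grad[i] += 2.0 * err * x_i / (n_heads * len(batch))
    heads[env_id] = [w_i - lr * g_i for w_i, g_i in zip(heads[env_id], grad)]


def make_env(rng, spurious_sign, n=64):
    """Toy data: y depends on x[0]; x[1] tracks y with an environment-specific sign."""
    data = []
    for _ in range(n):
        x0 = rng.gauss(0.0, 1.0)
        y = x0 + rng.gauss(0.0, 0.1)
        x1 = spurious_sign * y + rng.gauss(0.0, 0.1)
        data.append(([x0, x1], y))
    return data


def train(n_rounds=200, seed=0):
    """Cycle over environments; each turn updates only that environment's head."""
    rng = random.Random(seed)
    envs = [make_env(rng, +1.0), make_env(rng, -1.0)]
    heads = [[0.0, 0.0] for _ in envs]
    for _ in range(n_rounds):
        for env_id, batch in enumerate(envs):  # round-robin over environments
            round_robin_step(heads, env_id, batch)
    return heads, envs


if __name__ == "__main__":
    heads, envs = train()
    for env_id, batch in enumerate(envs):
        print(f"env {env_id}: head={heads[env_id]} loss={env_loss(heads, batch):.4f}")
```

Calling `train()` returns the per-environment heads and the toy datasets, so the weight each head places on the second (environment-dependent) feature can be inspected directly. The sketch shows only the competitive update schedule and makes no claim about reproducing the paper's results.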

Code & Models

Repositories

epfl-dlab/invariant-language-models (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning