CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration

Rachneet Sachdeva; Martin Tutek; Iryna Gurevych

arXiv:2309.07822·cs.CL·February 14, 2024

TL;DR

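This paper uses large language models (LLMs) to generate counterfactual training instances for small language models, improving their out-of-domain performance and calibration in extractive question answering. The gains correlate with greater diversity of the counterfactual instances, and the augmented models that are easier to calibrate show lower entropy in importance attribution.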

Contribution

The paper presents a data augmentation technique in which counterfactual instances generated by LLMs are added to the training data of small language models to improve their out-of-domain performance and calibration.

Findings

01. Counterfactual augmentation improves OOD QA performance.

02. Enhanced calibration correlates with diverse CF instances.

03. Calibrated models show lower entropy in importance attribution.
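
The third finding refers to the entropy of importance attribution. As a rough illustration only, and not the paper's implementation, the sketch below assumes that per-token attribution scores are turned into a probability distribution by normalizing their absolute values, and then computes Shannon entropy over that distribution. The function name `attribution_entropy` and the toy scores are hypothetical.

```python
import math


def attribution_entropy(scores):
    """Shannon entropy (in nats) of attribution scores.

    Assumption: scores are converted to a distribution by taking
    absolute values and dividing by their sum.
    """
    weights = [abs(s) for s in scores]
    total = sum(weights)
    if total == 0:
        raise ValueError("all attribution scores are zero")
    probs = [w / total for w in weights]
    # Terms with p == 0 contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


# Toy scores (made up): importance concentrated on one token vs. spread evenly.
peaked = attribution_entropy([0.9, 0.05, 0.03, 0.02])
uniform = attribution_entropy([0.25, 0.25, 0.25, 0.25])
```

Under this assumption, an attribution that concentrates on a few tokens yields lower entropy than one spread evenly over the input, which is the sense in which lower entropy corresponds to a more focused explanation.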

Abstract

In recent years, large language models (LLMs) have shown remarkable capabilities at scale, particularly at generating text conditioned on a prompt. In our work, we investigate the use of LLMs to augment training data of small language models (SLMs) with automatically generated counterfactual (CF) instances, i.e., minimally altered inputs, in order to improve out-of-domain (OOD) performance of SLMs in the extractive question answering (QA) setup. We show that, across various LLM generators, such data augmentation consistently enhances OOD performance and improves model calibration for both confidence-based and rationale-augmented calibrator models. Furthermore, these performance improvements correlate with higher diversity of CF instances in terms of their surface form and semantic content. Finally, we show that CF augmented models which are easier to calibrate also exhibit much lower…
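
The augmentation step described in the abstract can be pictured with a minimal sketch. Everything here is an assumption for illustration: the `QAExample` record, the `augment` helper, and the toy examples are hypothetical, and the token-level Jaccard distance is only one simple proxy for surface-form difference, not necessarily the diversity measure used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAExample:
    # Hypothetical record for one extractive QA training instance.
    question: str
    context: str
    answer: str
    is_counterfactual: bool = False


def augment(originals, counterfactuals):
    """Return originals followed by counterfactuals, dropping exact duplicates."""
    seen = set()
    merged = []
    for ex in list(originals) + list(counterfactuals):
        key = (ex.question, ex.context, ex.answer)
        if key not in seen:
            seen.add(key)
            merged.append(ex)
    return merged


def token_jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over lowercased whitespace tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    if not union:
        return 0.0
    return 1.0 - len(ta & tb) / len(union)


# Toy data (made up): a counterfactual alters the question so the answer changes.
context = "Hamlet was written by William Shakespeare around 1600."
originals = [QAExample("Who wrote Hamlet?", context, "William Shakespeare")]
counterfactuals = [
    QAExample("When was Hamlet written?", context, "around 1600", is_counterfactual=True)
]
train_set = augment(originals, counterfactuals)
```

In a real pipeline the counterfactual questions and answers would come from an LLM generator rather than being written by hand, and the merged set would then be used to train the small model.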

Code & Models

Repositories

ukplab/catfood (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)