LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images

Viraj Prabhu; Sriram Yenamandra; Prithvijit Chattopadhyay; Judy Hoffman

arXiv:2305.19164·cs.CV·October 31, 2023·6 cites

TL;DR

LANCE is an automated method that uses language-guided image editing to generate challenging counterfactual images for stress-testing visual models, revealing their vulnerabilities and biases without retraining.

Contribution

It combines large language models with text-based image editing to generate diverse, realistic, and challenging test images for evaluating the robustness of trained visual models.

Findings

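01. A diverse set of pre-trained models shows significant and consistent performance drops on the generated images.

02. The method surfaces previously unknown class-level model biases in ImageNet.

03. Analysis across different types of edits highlights model sensitivities to specific edits.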

Abstract

We propose an automated algorithm to stress-test a trained visual model by generating language-guided counterfactual test images (LANCE). Our method leverages recent progress in large language modeling and text-based image editing to augment an IID test set with a suite of diverse, realistic, and challenging test images without altering model weights. We benchmark the performance of a diverse set of pre-trained models on our generated data and observe significant and consistent performance drops. We further analyze model sensitivity across different types of edits, and demonstrate the method's applicability to surfacing previously unknown class-level model biases in ImageNet. Code is available at https://github.com/virajprabhu/lance.
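To make the benchmarking step concrete, here is a minimal sketch of how a performance drop could be tallied once a model's predictions on original and counterfactual images are available. It is not the authors' implementation: the `EditedSample` record, the `accuracy_drop_by_edit_type` helper, and the edit-type tags ("background", "weather") are hypothetical names chosen for illustration, and the sketch assumes each edit leaves the ground-truth label unchanged.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EditedSample:
    """One original/counterfactual pair (hypothetical record layout)."""
    label: str           # ground-truth class, assumed unchanged by the edit
    pred_original: str   # model prediction on the original test image
    pred_edited: str     # model prediction on the counterfactual image
    edit_type: str       # hypothetical tag such as "background" or "weather"


def accuracy_drop_by_edit_type(samples):
    """Return {edit_type: (acc_original, acc_edited, drop)}."""
    # Per edit type: [number of pairs, correct on original, correct on edited]
    counts = defaultdict(lambda: [0, 0, 0])
    for s in samples:
        c = counts[s.edit_type]
        c[0] += 1
        c[1] += int(s.pred_original == s.label)
        c[2] += int(s.pred_edited == s.label)

    report = {}
    for edit_type, (n, ok_orig, ok_edit) in counts.items():
        acc_orig = ok_orig / n
        acc_edit = ok_edit / n
        report[edit_type] = (acc_orig, acc_edit, acc_orig - acc_edit)
    return report


# Toy predictions, invented purely to exercise the function.
samples = [
    EditedSample("dog", "dog", "dog", "background"),
    EditedSample("dog", "dog", "wolf", "background"),
    EditedSample("cat", "cat", "cat", "weather"),
    EditedSample("cat", "cat", "cat", "weather"),
]
report = accuracy_drop_by_edit_type(samples)
for edit_type, (acc_orig, acc_edit, drop) in report.items():
    print(f"{edit_type}: original={acc_orig:.2f} edited={acc_edit:.2f} drop={drop:.2f}")
```

Grouping by edit type is one simple way to see which kinds of edits a model is most sensitive to; the same tally grouped by class label instead would be a starting point for spotting class-level biases.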

Code & Models

Videos

LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images · SlidesLive

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Radiomics and Machine Learning in Medical Imaging