Improving Users' Mental Model with Attention-directed Counterfactual Edits
Kamran Alipour, Arijit Ray, Xiao Lin, Michael Cogswell, Jurgen P. Schulze, Yi Yao, Giedrius T. Burachas

TL;DR
This paper demonstrates that controlled counterfactual examples, generated or retrieved, significantly enhance users' understanding of VQA systems by improving their ability to predict system answers on altered images.
Contribution
It introduces a method for improving users' mental models in VQA through attention-directed counterfactual edits, comparing generative and retrieval-based approaches.
Findings
Counterfactual examples improve user prediction accuracy.
Generative and retrieval approaches are equally effective.
Attention-based selection enhances counterfactual relevance.
Abstract
In the domain of Visual Question Answering (VQA), studies have shown improvement in users' mental model of the VQA system when they are exposed to examples of how these systems answer certain Image-Question (IQ) pairs. In this work, we show that presenting controlled counterfactual image-question examples is more effective at improving users' mental model than simply showing random examples. We compare a generative approach and a retrieval-based approach to showing counterfactual examples. We use recent advances in generative adversarial networks (GANs) to generate counterfactual images by deleting and inpainting certain regions of interest in the image. We then expose users to changes in the VQA system's answer on those altered images. To select the region of interest for inpainting, we experiment with using both human-annotated attention maps and a fully automatic method…
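The generative pipeline outlined in the abstract (pick a region from an attention map, remove it, then compare the VQA answer before and after) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `keep_fraction` threshold, and the toy `vqa_model` callable are all hypothetical, and a simple mean-color fill stands in for the GAN-based inpainting the paper uses.

```python
import numpy as np


def attention_mask(attention, keep_fraction=0.2):
    """Return a boolean (H, W) mask covering the most-attended pixels.

    `keep_fraction` is an assumed knob: roughly the share of pixels to edit.
    """
    threshold = np.quantile(attention, 1.0 - keep_fraction)
    return attention >= threshold


def delete_region(image, mask):
    """Stand-in for GAN inpainting (assumption): fill the masked pixels
    with the mean color of the unmasked pixels."""
    edited = image.copy()
    edited[mask] = image[~mask].mean(axis=0)
    return edited


def compare_answers(vqa_model, image, edited, question):
    """Query a VQA model (any callable taking image and question) on both
    images and report whether the answer changed."""
    before = vqa_model(image, question)
    after = vqa_model(edited, question)
    return before, after, before != after


def toy_vqa_model(image, question):
    """Hypothetical toy model: answers 'yes' if any bright pixel exists."""
    return "yes" if image.max() > 0.5 else "no"
```

In a user study along the lines described above, the pair of answers returned by `compare_answers` is what would be shown to participants next to the original and edited images.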
Taxonomy
Methods: Test · Counterfactuals Explanations · Inpainting
