Counterfactual Explanations for Natural Language Interfaces
George Tolkachev, Stephen Mell, Steve Zdancewic, Osbert Bastani

TL;DR
This paper introduces a method for generating counterfactual explanations in natural language interfaces, helping users understand system capabilities by showing minimal utterance modifications to achieve goals.
Contribution
It presents a novel semantic parsing-based approach for counterfactual explanations, improving user understanding and system transparency in natural language interfaces.
Findings
User performance improved substantially with the approach across two user studies.
Generated explanations closely match user intent.
The method outperforms two ablation baselines at matching user intent.
Abstract
A key challenge facing natural language interfaces is enabling users to understand the capabilities of the underlying system. We propose a novel approach for generating explanations of a natural language interface based on semantic parsing. We focus on counterfactual explanations, which are post-hoc explanations that describe to the user how they could have minimally modified their utterance to achieve their desired goal. In particular, the user provides an utterance along with a demonstration of their desired goal; then, our algorithm synthesizes a paraphrase of their utterance that is guaranteed to achieve their goal. In two user studies, we demonstrate that our approach substantially improves user performance, and that it generates explanations that more closely match the user's intent compared to two ablations.
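The workflow in the abstract (the user gives an utterance plus a demonstration of the desired result, and the system returns a nearby utterance that achieves it) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' algorithm: the toy string-editing grammar (`ACTIONS`, `TARGETS`), the brute-force enumeration in `candidates`, and the token-level edit distance used as the notion of "minimal modification" are all hypothetical stand-ins. The sketch only shows the general shape: enumerate utterances the system can parse, keep those whose program reproduces the demonstration, and return the one closest to what the user typed.

```python
from itertools import product

# Hypothetical toy domain: string-editing commands over a tiny grammar.
ACTIONS = {
    "uppercase": str.upper,
    "lowercase": str.lower,
    "reverse": lambda s: s[::-1],
}
TARGETS = {
    "the first word": 0,
    "the last word": -1,
}


def candidates():
    """Enumerate every (utterance, program) pair the toy grammar supports."""
    for (verb, fn), (phrase, idx) in product(ACTIONS.items(), TARGETS.items()):
        # Bind fn and idx as defaults so each program keeps its own values.
        def program(text, fn=fn, idx=idx):
            words = text.split()
            words[idx] = fn(words[idx])
            return " ".join(words)

        yield f"{verb} {phrase}", program


def edit_distance(a, b):
    """Token-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def counterfactual(utterance, text, desired):
    """Return the supported utterance closest to `utterance` whose program
    maps `text` to `desired`, or None if no supported utterance does."""
    consistent = [u for u, p in candidates() if p(text) == desired]
    if not consistent:
        return None
    tokens = utterance.lower().split()
    return min(consistent, key=lambda u: edit_distance(tokens, u.split()))


# The user's phrasing is not in the grammar; the demonstration pins down the goal.
print(counterfactual("capitalize the final word", "hello world", "hello WORLD"))
```

Because only utterances whose program reproduces the demonstration are kept, whatever the sketch returns achieves the demonstrated goal by construction, which mirrors the guarantee described in the abstract. A real system would replace the enumeration with a semantic parser and a search over a much larger utterance space.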
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Business Process Modeling and Analysis
