Model-Free Counterfactual Subset Selection at Scale
Minh Hieu Nguyen, Viet Hung Doan, Anh Tuan Nguyen, Jun Jo, and Quoc Viet Hung Nguyen

TL;DR
This paper presents a scalable, model-free algorithm for selecting diverse, relevant counterfactual explanations directly from streaming data, enabling real-time, interpretable AI decisions without relying on synthetic data or full dataset storage.
Contribution
It introduces a novel streaming algorithm for counterfactual selection that is efficient, model-free, and maintains high-quality explanations in real-time environments.
Findings
Outperforms baseline methods on real-world and synthetic datasets.
Maintains $O(\log k)$ update complexity per item.
Demonstrates robustness under adversarial conditions.
Abstract
Ensuring transparency in AI decision-making requires interpretable explanations, particularly at the instance level. Counterfactual explanations are a powerful tool for this purpose, but existing techniques frequently depend on synthetic examples, introducing biases from unrealistic assumptions, flawed models, or skewed data. Many methods also assume full dataset availability, an impractical constraint in real-time environments where data flows continuously. In contrast, streaming explanations offer adaptive, real-time insights without requiring persistent storage of the entire dataset. This work introduces a scalable, model-free approach to selecting diverse and relevant counterfactual examples directly from observed data. Our algorithm operates efficiently in streaming settings, maintaining $O(\log k)$ update complexity per item while ensuring high-quality counterfactual selection.…
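To make the $O(\log k)$ per-item update concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes a fixed query point, Euclidean distance as the relevance measure, and a bounded min-heap of size k; it ignores the diversity objective that the paper addresses. The class name `StreamingCounterfactualBuffer` and its methods are hypothetical names introduced only for this example.

```python
import heapq
import itertools
import math


class StreamingCounterfactualBuffer:
    """Keep the k nearest differently-labeled points seen so far in a stream.

    Illustrative assumption: relevance is negative Euclidean distance to the
    query. This is a stand-in objective, not the paper's selection criterion.
    """

    def __init__(self, query, query_label, k):
        self.query = tuple(query)
        self.query_label = query_label
        self.k = k
        self._heap = []  # min-heap on score; the root is the least relevant kept item
        self._counter = itertools.count()  # tie-breaker so items are never compared

    def update(self, x, label):
        """Process one streamed item in O(log k) time; return True if it was kept."""
        if label == self.query_label:
            return False  # same outcome as the query, so not a counterfactual
        score = -math.dist(self.query, x)
        entry = (score, next(self._counter), tuple(x))
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)
            return True
        if score > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)  # evict the least relevant item
            return True
        return False

    def selected(self):
        """Return the kept items, most relevant first."""
        return [item for _, _, item in sorted(self._heap, reverse=True)]


if __name__ == "__main__":
    buf = StreamingCounterfactualBuffer(query=(0.0, 0.0), query_label=0, k=2)
    stream = [((1.0, 0.0), 1), ((0.5, 0.0), 0), ((3.0, 0.0), 1),
              ((2.0, 0.0), 1), ((0.1, 0.0), 1)]
    for x, label in stream:
        buf.update(x, label)
    print(buf.selected())
```

Only k items are stored at any time, so the stream itself never needs to be retained, and each heap push or replace costs $O(\log k)$. Every kept item is an observed data point rather than a synthetic one, which matches the model-free setting described above, although a faithful implementation would also need to account for diversity among the selected items.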
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Fault Detection and Control Systems · Machine Learning and Data Classification
