Evaluating Black-Box Vulnerabilities with Wasserstein-Constrained Data Perturbations

Adriana Laurindo Monteiro; Jean-Michel Loubes

arXiv:2603.15867·cs.LG·April 23, 2026

TL;DR

This paper introduces a model-agnostic framework using Wasserstein constraints to analyze ML robustness against realistic data perturbations, enhancing explainability and fairness diagnostics.

Contribution

It presents a novel approach combining Optimal Transport and Distributionally Robust Optimization for realistic, feature-level constrained data perturbations.

Findings

01 Provides a theoretical guarantee for the robustness diagnostics.

02 Validates the approach on real-world datasets in tabular and image domains.

03 Offers a diagnostic tool that complements existing evaluation methods.

Abstract

The growing use of Machine Learning (ML) tools comes with critical challenges, such as limited model explainability. We propose a global explainability framework that leverages Optimal Transport and Distributionally Robust Optimization to analyze how ML algorithms respond to constrained data perturbations. Our approach enforces constraints on feature-level statistics (e.g., brightness, age distribution), generating realistic perturbations that preserve semantic structure. We provide a model-agnostic diagnostic bench that applies to both tabular and image domains with solid theoretical guarantees. We validate the approach on real-world datasets, providing interpretable robustness diagnostics that complement standard evaluation and fairness auditing tools.
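To make the idea of a statistic-constrained perturbation concrete, here is a minimal Python sketch. It is not the authors' algorithm: it only illustrates the simplest case, where the constraint is a new mean for a single tabular feature. In that case a plain translation of the feature attains the smallest possible 2-Wasserstein cost, which equals the size of the mean shift. The names `toy_model`, `shift_feature_mean`, and `wasserstein2_1d`, along with the synthetic data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def wasserstein2_1d(x, y):
    """Empirical 2-Wasserstein distance between two equal-size 1-D samples.

    In one dimension the optimal coupling matches sorted values.
    """
    x_sorted = np.sort(x)
    y_sorted = np.sort(y)
    return float(np.sqrt(np.mean((x_sorted - y_sorted) ** 2)))


def shift_feature_mean(X, col, target_mean):
    """Translate one column so that its sample mean equals target_mean."""
    X_new = X.copy()
    X_new[:, col] += target_mean - X[:, col].mean()
    return X_new


def toy_model(X):
    """Hypothetical black box: predicts 1 when feature 0 exceeds 40."""
    return (X[:, 0] > 40.0).astype(int)


# Synthetic data (assumption): column 0 plays the role of an "age" feature.
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(35.0, 10.0, 1000), rng.normal(0.0, 1.0, 1000)])

# Constraint on a feature-level statistic: raise the mean of column 0 by 5.
X_shifted = shift_feature_mean(X, col=0, target_mean=X[:, 0].mean() + 5.0)

# Transport cost of the perturbation and the model's response to it.
cost = wasserstein2_1d(X[:, 0], X_shifted[:, 0])
baseline_rate = toy_model(X).mean()
shifted_rate = toy_model(X_shifted).mean()

print(f"W2 cost of perturbation: {cost:.3f}")
print(f"positive rate before: {baseline_rate:.3f}, after: {shifted_rate:.3f}")
```

Comparing the positive prediction rate before and after the shift gives a crude robustness reading for that one statistic. The framework described in the abstract handles richer constraints and image data, which this sketch does not attempt.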
