I-SAFE: Wasserstein Coherence Metrics for Structural Auditing of Scientific AI Models

Barbara Tarantino; Gennaro Auricchio; Paolo Giudici

arXiv:2605.21731 · cs.LG · May 22, 2026

TL;DR

The paper introduces I-SAFE, a framework for auditing scientific AI models using Wasserstein coherence metrics to evaluate distributional alignment with domain knowledge, revealing differences invisible to accuracy metrics.

Contribution

The paper proposes I-SAFE, a post-hoc distributional auditing framework that uses the Wasserstein Coherence Metric (WCM) to assess model outputs against external structural priors.

Findings

01

I-SAFE reveals significant differences in model behavior not captured by accuracy.

02

The framework is applicable across domains with structured inputs and external priors.

03

Applied to drug-target interaction models, I-SAFE uncovers distributional discrepancies.

Abstract

Deep learning models are increasingly used in scientific prediction tasks where strong benchmark performance is often interpreted as evidence of scientifically meaningful behavior. This interpretation is fragile, as models may exploit shortcut features, dataset-specific regularities, or distributional biases that are predictive on held-out data but not aligned with domain-relevant structure. To address this limitation, we introduce the I-SAFE (Interventional Secure, Accurate, Fair and Explainable) framework, a post-hoc distributional auditing framework for scientific AI models centered on the Wasserstein Coherence Metric (WCM). Given a trained black-box predictor and an external structural prior encoding domain knowledge about task-relevant input structure, I-SAFE evaluates raw model outputs under structurally guided perturbations of the input. The proposed audit…
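
The general idea in the abstract, comparing a black-box model's raw outputs before and after perturbing the inputs in a way guided by a structural prior, can be illustrated with a small sketch. This is not the paper's definition of the WCM, which the excerpt above does not give. It is a toy illustration under stated assumptions: the prior is a hypothetical list of feature indices marked as domain-relevant, perturbations are additive Gaussian noise, and the shift in the output distribution is measured with the one-dimensional 1-Wasserstein distance between equal-size samples. The names `wasserstein_1d`, `perturb`, and `coherence_gap` are invented for this example.

```python
import random


def wasserstein_1d(u, v):
    """1-Wasserstein distance between two equal-size 1D empirical samples.

    For equal sample sizes this equals the mean absolute difference
    between the sorted values.
    """
    if len(u) != len(v):
        raise ValueError("samples must have the same size")
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


def perturb(x, indices, rng, scale=1.0):
    """Return a copy of x with Gaussian noise added to the given features."""
    out = list(x)
    for i in indices:
        out[i] += rng.gauss(0.0, scale)
    return out


def coherence_gap(model, inputs, prior_indices, n_features, seed=0):
    """Compare output shifts under prior-guided and control perturbations.

    Assumption: prior_indices is a hypothetical structural prior listing
    the features that domain knowledge marks as task-relevant.
    Returns (guided_shift, control_shift), both Wasserstein distances
    between the unperturbed and perturbed output distributions.
    """
    rng = random.Random(seed)
    base = [model(x) for x in inputs]
    # Perturb the features the prior marks as relevant.
    guided = [model(perturb(x, prior_indices, rng)) for x in inputs]
    # Control: perturb the same number of features the prior does not mark.
    non_prior = [i for i in range(n_features) if i not in prior_indices]
    k = len(prior_indices)
    control = [model(perturb(x, rng.sample(non_prior, k), rng)) for x in inputs]
    return wasserstein_1d(base, guided), wasserstein_1d(base, control)


if __name__ == "__main__":
    data_rng = random.Random(42)
    xs = [[data_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]

    def aligned(x):
        # Toy model that relies on the prior-relevant feature 0.
        return 2.0 * x[0]

    def shortcut(x):
        # Toy model that relies on feature 3, which the prior does not mark.
        return 2.0 * x[3]

    print("aligned :", coherence_gap(aligned, xs, [0], 4))
    print("shortcut:", coherence_gap(shortcut, xs, [0], 4))
```

In this toy setting, a model that depends on the prior-marked feature shifts its output distribution under guided perturbations and not under the control, while a shortcut model shows the opposite pattern. Both toy models could be equally accurate on a dataset where features 0 and 3 are correlated, which is the kind of difference the abstract says accuracy alone does not expose.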
