KAIROS: Scalable Model-Agnostic Data Valuation
Jiongli Zhu, Parjanya Prajakta Prashant, Alex Cloninger, Babak Salimi

TL;DR
KAIROS is a scalable, model-agnostic data valuation framework that accurately ranks training examples by their contribution to model utility using a closed-form MMD-based influence score, outperforming existing methods in speed and accuracy.
Contribution
We introduce KAIROS, a novel MMD-based influence scoring method that provides accurate, scalable, and model-agnostic data valuation with theoretical guarantees and efficient online updates.
Findings
KAIROS outperforms state-of-the-art baselines in accuracy and runtime.
The influence scores closely approximate true leave-one-out utility rankings.
KAIROS supports efficient online updates with significant speedups.
Abstract
Training data increasingly shapes not only model accuracy but also regulatory compliance and market valuation of AI assets. Yet existing valuation methods remain inadequate: model-based techniques depend on a single fitted model and inherit its biases, while algorithm-based approaches such as Data Shapley require costly retrainings at web scale. Recent Wasserstein-based model-agnostic methods rely on approximations that misrank examples relative to their true leave-one-out (LOO) utility. We introduce KAIROS, a scalable, model-agnostic valuation framework that assigns each example a distributional influence score: its contribution to the Maximum Mean Discrepancy (MMD) between the empirical training distribution and a clean reference set. Unlike Wasserstein surrogates, our MMD-based influence admits a closed-form solution that faithfully approximates the exact LOO ranking within…
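To make the idea of a distributional influence score concrete, here is a minimal sketch that computes the exact leave-one-out change in squared MMD between a training set and a reference set. It is an illustration under stated assumptions, not the KAIROS score itself: the RBF kernel, the biased (V-statistic) MMD estimator, the sign convention, and the names rbf_kernel, mmd2, loo_mmd2_change, and gamma are all hypothetical choices made for this example.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd2(x, y, gamma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    return (
        rbf_kernel(x, x, gamma).mean()
        - 2.0 * rbf_kernel(x, y, gamma).mean()
        + rbf_kernel(y, y, gamma).mean()
    )


def loo_mmd2_change(train, ref, gamma=1.0):
    """For each training point i, return MMD^2(train without i, ref) - MMD^2(train, ref).

    Assumed sign convention for this sketch: a negative value means that
    dropping point i moves the training set closer to the reference set.
    All n values come from kernel row sums, so no kernel is recomputed per point.
    """
    n, m = len(train), len(ref)
    k_tt = rbf_kernel(train, train, gamma)  # train-train kernel, shape (n, n)
    k_tr = rbf_kernel(train, ref, gamma)    # train-reference kernel, shape (n, m)
    ref_term = rbf_kernel(ref, ref, gamma).mean()

    total_tt, row_tt, diag_tt = k_tt.sum(), k_tt.sum(axis=1), np.diag(k_tt)
    total_tr, row_tr = k_tr.sum(), k_tr.sum(axis=1)

    full = total_tt / n**2 - 2.0 * total_tr / (n * m) + ref_term
    # Removing point i drops its row and column from k_tt (the diagonal entry
    # is subtracted twice by 2 * row_tt, so it is added back once) and its row from k_tr.
    without = (
        (total_tt - 2.0 * row_tt + diag_tt) / (n - 1) ** 2
        - 2.0 * (total_tr - row_tr) / ((n - 1) * m)
        + ref_term
    )
    return without - full


rng = np.random.default_rng(0)
train = rng.normal(size=(20, 3))
ref = rng.normal(size=(15, 3))
scores = loo_mmd2_change(train, ref, gamma=0.5)
ranking = np.argsort(scores)  # most negative change first
```

Because the row sums are computed once, all n leave-one-out values cost one pass over the two kernel matrices rather than n separate MMD evaluations. The result matches recomputing mmd2 on each reduced training set up to floating-point error.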
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
