Model Monitoring in the Absence of Labeled Data via Feature Attributions Distributions

Carlos Mougan

arXiv:2501.10774 · cs.LG · January 28, 2025

Open Access

TL;DR

This thesis proposes a method for monitoring deployed AI models without labeled data by analyzing the distributions of their feature attributions, with theoretical guarantees for both AI alignment and performance monitoring.

Contribution

It introduces a novel approach using feature attribution distributions for model monitoring in unlabeled settings, with theoretical insights and guarantees.

Findings

01

Effective detection of model behavior changes without labels

02

Theoretical guarantees for feature attribution-based monitoring

03

Applicability to AI alignment and performance metrics

Abstract

Model monitoring involves analyzing AI algorithms once they have been deployed and detecting changes in their behaviour. This thesis explores the monitoring of machine learning (ML) models before their predictions impact real-world decisions or users. This step is characterized by one particular condition: the absence of labelled data at test time, which makes it challenging, and often impossible, to calculate performance metrics. The thesis is structured around two main themes: (i) AI alignment, measuring whether AI models behave in a manner consistent with human values, and (ii) performance monitoring, measuring whether the models achieve specific accuracy goals or desiderata. A common methodology unifies all its sections: it explores feature attribution distributions for both monitoring dimensions. Using these feature attribution explanations, we can exploit their theoretical…
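To make the idea of comparing feature attribution distributions without labels more concrete, here is a minimal sketch. It is not the thesis's method: it assumes a hypothetical linear model whose attribution for feature j is w_j * (x_j - mean_j), and it uses a two-sample Kolmogorov-Smirnov statistic as one possible distance between the attribution distributions of reference data and new data. All function names and the synthetic data are illustrative assumptions.

```python
import random


def linear_attributions(weights, baseline, x):
    """Per-feature attribution for an assumed linear model: w_j * (x_j - baseline_j)."""
    return [w * (xj - b) for w, xj, b in zip(weights, x, baseline)]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Advance both samples past the current value, then compare the CDFs.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def attribution_shift(weights, reference, current):
    """KS statistic per feature between reference and current attribution distributions."""
    # The baseline is the per-feature mean of the reference data (an assumption).
    baseline = [sum(col) / len(col) for col in zip(*reference)]
    ref_attr = [linear_attributions(weights, baseline, x) for x in reference]
    cur_attr = [linear_attributions(weights, baseline, x) for x in current]
    return [ks_statistic(r, c) for r, c in zip(zip(*ref_attr), zip(*cur_attr))]


# Synthetic data: no labels are used anywhere in the monitoring step.
rng = random.Random(0)
weights = [2.0, 0.0]  # hypothetical model that ignores the second feature
reference = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)]
same = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)]
shifted = [[rng.gauss(1.5, 1), rng.gauss(3, 1)] for _ in range(500)]

same_scores = attribution_shift(weights, reference, same)
shifted_scores = attribution_shift(weights, reference, shifted)
print("no shift:", same_scores)
print("shifted: ", shifted_scores)
```

In this sketch, a shift in a feature the model relies on shows up as a large statistic. A shift in a feature with zero weight leaves the attribution distribution unchanged, so the score only reflects changes that affect how the model uses its inputs.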

Taxonomy

Topics: Machine Learning and Data Classification · Bayesian Modeling and Causal Inference