Provably Safe Model Updates

Leo Elmecker-Plakolm; Pierre Fasterling; Philip Sosnin; Calvin Tsay; Matthew Wicker

arXiv:2512.01899 · cs.LG · March 19, 2026

Open Access

TL;DR

This paper introduces a formal framework for certifying model updates in dynamic, safety-critical environments, so that updated models provably continue to satisfy required performance specifications as distributions shift and requirements evolve.

Contribution

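The paper formalizes safe model updating as computing the largest locally invariant domain (LID), a connected region in parameter space where all points are certified to satisfy a given specification, and develops a tractable primal-dual approach for certification.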

Findings

01 Efficient certification of model updates independent of data or algorithms

02 Matches or exceeds heuristic baselines in continual learning and foundation model fine-tuning

03 Provides formal safety guarantees for model updates

Abstract

Safety-critical environments are inherently dynamic. Distribution shifts, emerging vulnerabilities, and evolving requirements demand continuous updates to machine learning models. Yet even benign parameter updates can have unintended consequences, such as catastrophic forgetting in classical models or alignment drift in foundation models. Existing heuristic approaches (e.g., regularization, parameter isolation) can mitigate these effects but cannot certify that updated models continue to satisfy required performance specifications. We address this problem by introducing a framework for provably safe model updates. Our approach first formalizes the problem as computing the largest locally invariant domain (LID): a connected region in parameter space where all points are certified to satisfy a given specification. While exact maximal LID computation is intractable, we show that relaxing…
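To make the LID definition concrete, the following is a minimal sketch for a toy case and not the paper's algorithm. It assumes a linear classifier, a specification requiring a non-negative margin on a fixed set of certification points, and an axis-aligned box around the reference parameters as the candidate domain. The box shape, the function names (`worst_case_margins`, `largest_certified_radius`, `project_update`), and the data are all illustrative assumptions. A box is connected, so if every point in it satisfies the specification, it meets the definition of a locally invariant domain given above, though it is generally not the maximal one.

```python
import numpy as np


def worst_case_margins(w_ref, radius, X, y):
    """Lower bound on y_i * (w @ x_i) over all w with |w - w_ref|_inf <= radius.

    For a linear model the bound is exact: the adversarial w shifts each
    coordinate by `radius` against the sign of y_i * x_ij.
    """
    nominal = y * (X @ w_ref)
    return nominal - radius * np.abs(X).sum(axis=1)


def largest_certified_radius(w_ref, X, y):
    """Largest box radius for which every certification point keeps margin >= 0."""
    nominal = y * (X @ w_ref)
    if np.any(nominal < 0):
        raise ValueError("reference parameters already violate the specification")
    l1 = np.abs(X).sum(axis=1)
    # Points with x_i = 0 never constrain the radius, so they default to +inf.
    ratios = np.full_like(nominal, np.inf, dtype=float)
    np.divide(nominal, l1, out=ratios, where=l1 > 0)
    return float(ratios.min())


def project_update(w_ref, w_new, radius):
    """Clip a proposed update back into the certified box around w_ref."""
    return np.clip(w_new, w_ref - radius, w_ref + radius)


# Hypothetical certification set and reference parameters (illustrative only).
w_ref = np.array([1.0, -0.5])
X = np.array([[2.0, 1.0], [-1.0, 1.0], [1.0, -2.0]])
y = np.array([1.0, -1.0, 1.0])

radius = largest_certified_radius(w_ref, X, y)

# Any update, from any data or optimizer, is made safe by projection.
w_proposed = np.array([2.0, -0.6])
w_safe = project_update(w_ref, w_proposed, radius)
margins_after = y * (X @ w_safe)
```

In this toy setting the certificate depends only on the reference parameters and the specification, so the projection step can be applied to an update produced by any training procedure. For nonlinear models the worst-case margin has no such closed form, which is where a relaxation of the kind the abstract alludes to would be needed.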

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms