SafeFix: Targeted Model Repair via Controlled Image Generation
Ouyang Xu, Baoming Zhang, Ruiyu Mao, Yunhui Guo

TL;DR
SafeFix repairs visual recognition models by generating targeted, semantically faithful images for their failure cases through controlled image generation, then filtering those images, which improves robustness.
Contribution
SafeFix presents a model repair method that combines interpretable failure attribution with conditional image generation to address systematic errors in visual recognition models.
Findings
Significantly reduces errors on rare failure cases.
Maintains model accuracy and avoids introducing new bugs.
Enhances robustness through targeted synthetic data augmentation.
Abstract
Deep learning models for visual recognition often exhibit systematic errors due to underrepresented semantic subpopulations. Although existing debugging frameworks can pinpoint these failures by identifying key failure attributes, repairing the model effectively remains difficult. Current solutions often rely on manually designed prompts to generate synthetic training images -- an approach prone to distribution shift and semantic errors. To overcome these challenges, we introduce a model repair module that builds on an interpretable failure attribution pipeline. Our approach uses a conditional text-to-image model to generate semantically faithful and targeted images for failure cases. To preserve the quality and relevance of the generated samples, we further employ a large vision-language model (LVLM) to filter the outputs, enforcing alignment with the original data distribution and…
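The generate-then-filter loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `FailureCase`, `build_prompt`, `repair_set`, and the `generate` and `keep` callables are hypothetical names standing in for the failure attribution output, the conditional text-to-image model, and the LVLM filter. The prompt template is a placeholder, since the summary does not say how the generator is conditioned.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class FailureCase:
    """Hypothetical record for one failure subpopulation."""
    label: str      # class the model misclassifies
    attribute: str  # failure attribute reported by the debugging pipeline


def build_prompt(case: FailureCase) -> str:
    # Placeholder template; the paper's actual conditioning is not shown here.
    return f"a photo of a {case.label}, {case.attribute}"


def repair_set(
    cases: Sequence[FailureCase],
    generate: Callable[[str, int], Sequence[Any]],  # stand-in for the text-to-image model
    keep: Callable[[Any, FailureCase], bool],       # stand-in for the LVLM filter
    per_case: int = 4,
) -> List[Tuple[Any, str]]:
    """Generate candidates per failure case and keep only those the filter accepts."""
    kept: List[Tuple[Any, str]] = []
    for case in cases:
        prompt = build_prompt(case)
        for image in generate(prompt, per_case):
            if keep(image, case):
                kept.append((image, case.label))
    return kept
```

In use, `generate` would wrap a real image generator and `keep` would wrap a vision-language model query about whether the image shows the intended class and attribute. The returned `(image, label)` pairs would then be added to the training data for the repair step.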
