RAZOR: Ratio-Aware Layer Editing for Targeted Unlearning in Vision Transformers and Diffusion Models
Ravi Ranjan, Utkarsh Grover, Xiaomin Lin, Agoritsa Polyzou

TL;DR
RAZOR is a lightweight, model-agnostic framework for precise and efficient unlearning in transformer-based vision models, capable of removing sensitive information without retraining.
Contribution
It introduces a novel ratio-aware method for targeted multi-layer and multi-head unlearning, improving accuracy, stability, and speed over prior approaches.
Findings
Achieves highly accurate forgetting in vision transformers and diffusion models.
Operates faster and with better retention than existing unlearning methods.
Effective even under model quantization.
Abstract
Transformer-based diffusion and vision-language models have achieved remarkable success, yet efficiently removing undesirable or sensitive information without retraining remains a central challenge for model safety and compliance. We introduce Ratio-Aware Zero/One-step Optimized Retentive unlearning (RAZOR), a lightweight, model-agnostic unlearning framework that generalizes forgetting updates to coordinated multi-layer and multi-head edits within transformer backbones. RAZOR identifies the most important layers and attention heads by measuring how much each contributes to forgetting the target data while preserving useful knowledge. It then updates these components using a carefully regularized rule to avoid harming overall performance. The set of edited components grows gradually, ensuring precise unlearning without over-editing or damaging unrelated capabilities. We evaluate…
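The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give RAZOR's scoring formula, so everything here is an assumption made for illustration: the score is taken to be the ratio of a component's forget-gradient norm to its retain-gradient norm, and the names `ratio_scores`, `growing_edit_sets`, and the component labels are hypothetical. The regularized update rule is not specified in the abstract and is left out.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def ratio_scores(forget_grads, retain_grads, eps=1e-8):
    """Score each component (layer or head) by an ASSUMED ratio:
    forget-gradient norm divided by retain-gradient norm.
    A high score means the component matters for the forget set
    but little for the data to be retained."""
    return {
        name: l2_norm(forget_grads[name]) / (l2_norm(retain_grads[name]) + eps)
        for name in forget_grads
    }


def growing_edit_sets(scores, start=1, step=1):
    """Yield progressively larger sets of top-ranked components,
    mimicking an edit set that grows gradually."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    for k in range(start, len(ranked) + 1, step):
        yield ranked[:k]


# Toy per-component gradients (hypothetical values, not from the paper).
forget_grads = {
    "layer3.head1": [3.0, 4.0],
    "layer5.head0": [1.0, 0.0],
    "layer7.head2": [0.6, 0.8],
}
retain_grads = {
    "layer3.head1": [1.0, 0.0],
    "layer5.head0": [0.0, 2.0],
    "layer7.head2": [0.0, 0.1],
}

scores = ratio_scores(forget_grads, retain_grads)
edit_sets = list(growing_edit_sets(scores))

for step_index, components in enumerate(edit_sets, start=1):
    print(f"step {step_index}: edit {components}")
```

In a real pipeline, an outer loop would apply the update to each candidate set and stop growing once forgetting on the target data is sufficient, so that components with low scores are never touched.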
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
