Assimilation Matters: Model-level Backdoor Detection in Vision-Language Pretrained Models

Zhongqi Wang; Jie Zhang; Shiguang Shan; Xilin Chen

arXiv:2512.00343·cs.CV·December 2, 2025

TL;DR

This paper introduces AMDET, a model-level detection framework for identifying backdoors in vision-language pretrained models without prior knowledge, using feature assimilation properties and gradient-based inversion techniques.

Contribution

AMDET is the first to detect backdoors in VLPs at the model level without relying on prior training data or trigger information, utilizing feature assimilation and gradient inversion.

Findings

01

Achieves an 89.90% F1 score in backdoor detection

02

Detects backdoors within approximately 5 minutes on a high-end GPU

03

Demonstrates robustness against adaptive attack strategies

Abstract

Vision-language pretrained models (VLPs) such as CLIP have achieved remarkable success, but are also highly vulnerable to backdoor attacks. Given a model fine-tuned by an untrusted third party, determining whether the model has been injected with a backdoor is a critical and challenging problem. Existing detection methods usually rely on prior knowledge of the training dataset, backdoor triggers and targets, or downstream classifiers, which may be impractical for real-world applications. To address this challenge, we introduce Assimilation Matters in DETection (AMDET), a novel model-level detection framework that operates without any such prior knowledge. Specifically, we first reveal the feature assimilation property in backdoored text encoders: the representations of all tokens within a backdoor sample exhibit a high similarity. Further analysis attributes this effect to…
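
The feature assimilation property described in the abstract can be illustrated with a small sketch. This is not the AMDET procedure (the abstract does not give its details); it only shows one plausible way to quantify "high similarity among token representations", namely the mean pairwise cosine similarity. The function names `cosine` and `assimilation_score` and the toy vectors are assumptions made for illustration, not outputs of any real text encoder.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def assimilation_score(token_features):
    """Mean pairwise cosine similarity over a list of token vectors.

    Hypothetical metric: values near 1 mean the token representations
    have collapsed toward one direction.
    """
    pairs = list(combinations(token_features, 2))
    if not pairs:
        raise ValueError("need at least two token vectors")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy vectors (made up): three mutually orthogonal "clean" tokens and
# three nearly parallel "assimilated" tokens.
clean_tokens = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
assimilated_tokens = [[1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [0.9, 0.1, 0.1]]

clean_score = assimilation_score(clean_tokens)
assimilated_score = assimilation_score(assimilated_tokens)
print(clean_score, assimilated_score)
```

In this toy setup the orthogonal tokens score far lower than the nearly parallel ones, which is the kind of gap the assimilation property suggests a detector could look for. With a real model, the vectors would be per-token outputs of the text encoder for a single input.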

Code & Models

Models

🤗
RobinWZQ/poisoned_model_1
model · 2 downloads

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Topic Modeling