Imitation Game for Adversarial Disillusion with Chain-of-Thought Reasoning in Generative AI
Ching-Chun Chang, Fan-Yun Chen, Shih-Hong Gu, Kai Gao, Hanrui Wang, Isao Echizen

TL;DR
This paper introduces a disillusion framework, built on an imitation game with chain-of-thought reasoning, that defends multimodal generative AI against both deductive and inductive adversarial illusions.
Contribution
It proposes a unified defense paradigm based on an imitation game that leverages chain-of-thought reasoning to counter diverse adversarial illusions in AI models.
Findings
The framework neutralizes both deductive and inductive adversarial illusions.
It is effective across white-box and black-box attack scenarios.
Experimental results confirm the robustness of the proposed method in these settings.
Abstract
As the cornerstone of artificial intelligence, machine perception confronts a fundamental threat posed by adversarial illusions. These adversarial attacks manifest in two primary forms: deductive illusion, where specific stimuli are crafted based on the victim model's general decision logic, and inductive illusion, where the victim model's general decision logic is shaped by specific stimuli. The former exploits the model's decision boundaries to create a stimulus that, when applied, interferes with its decision-making process. The latter reinforces a conditioned reflex in the model, embedding a backdoor during its learning phase that, when triggered by a stimulus, causes aberrant behaviours. The multifaceted nature of adversarial illusions calls for a unified defence framework, addressing vulnerabilities across various forms of attack. In this study, we propose a disillusion paradigm…
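The distinction the abstract draws between the two attack families can be made concrete with a toy sketch. The code below is an illustration only, assuming a three-feature linear perceptron in which the third feature acts as a hypothetical backdoor trigger; it is not the paper's victim model, attack setup, or defence. All names (`train_perceptron`, `deductive_illusion`, `CLEAN`, `POISON`) are invented for this example. In the deductive case the model is fixed and the stimulus is crafted from its weights; in the inductive case the training data is poisoned so that the learned model itself responds to the trigger.

```python
# Toy illustration only: a linear perceptron, not the models or attacks from the paper.

def score(w, b, x):
    """Linear decision score w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if score(w, b, x) > 0 else -1

def train_perceptron(data, epochs=100):
    """Classic perceptron rule; stops early once an epoch has no mistakes."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            if y * score(w, b, x) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b

def sign(v):
    return (v > 0) - (v < 0)

def deductive_illusion(w, x, y, eps):
    """Craft a stimulus from a FIXED model: step each feature against the
    true label y along sign(w), a sign-gradient step for a linear score."""
    return [xi - eps * y * sign(wi) for wi, xi in zip(w, x)]

# Features: [x0, x1, trigger]. Clean samples never set the trigger.
CLEAN = [([2.0, 0.0, 0.0], 1), ([1.0, 1.0, 0.0], 1),
         ([-2.0, 0.0, 0.0], -1), ([-1.0, -1.0, 0.0], -1)]
# Inductive illusion: positive-looking samples carrying the trigger are
# mislabelled as -1, so training itself implants the backdoor.
POISON = [([2.0, 0.0, 1.0], -1), ([1.0, 1.0, 1.0], -1)]

clean_w, clean_b = train_perceptron(CLEAN)
backdoored_w, backdoored_b = train_perceptron(CLEAN + POISON)

x, y = [1.0, 1.0, 0.0], 1          # benign input with true label +1
x_adv = deductive_illusion(clean_w, x, y, eps=2.0)
x_trig = [1.0, 1.0, 1.0]           # same input with the trigger switched on

print("clean model, benign input:      ", predict(clean_w, clean_b, x))
print("clean model, crafted input:     ", predict(clean_w, clean_b, x_adv))
print("backdoored model, benign input: ", predict(backdoored_w, backdoored_b, x))
print("backdoored model, trigger input:", predict(backdoored_w, backdoored_b, x_trig))
```

In this sketch the crafted input flips the clean model's decision without any change to the model, while the trigger flips only the backdoored model and leaves the clean one unaffected. That asymmetry is the reason the two attack forms are usually handled by separate defences, and the motivation the abstract gives for a unified one.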
