SmoothGuard: Defending Multimodal Large Language Models with Noise Perturbation and Clustering Aggregation

Guangzhi Su; Shuchang Huang; Yutong Ke; Zhuohang Liu; Long Qian; Kaizhu Huang

arXiv:2510.26830 · cs.LG · November 3, 2025

TL;DR

SmoothGuard is a defense framework that improves the robustness of multimodal large language models against adversarial attacks. It injects Gaussian noise into continuous inputs, generates several candidate outputs, and aggregates them through embedding-based clustering, while maintaining competitive utility.

Contribution

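The paper generalizes an approach for generating adversarial images within the HuggingFace ecosystem and introduces SmoothGuard, a lightweight, model-agnostic method that combines randomized noise injection with clustering-based prediction aggregation to defend multimodal models against adversarial manipulations.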

Findings

01. Improves robustness of MLLMs against adversarial attacks.

02. Maintains competitive utility while enhancing security.

03. Identifies optimal noise levels for balancing robustness and performance.

Abstract

Multimodal large language models (MLLMs) have achieved impressive performance across diverse tasks by jointly reasoning over textual and visual inputs. Despite their success, these models remain highly vulnerable to adversarial manipulations, raising concerns about their safety and reliability in deployment. In this work, we first generalize an approach for generating adversarial images within the HuggingFace ecosystem and then introduce SmoothGuard, a lightweight and model-agnostic defense framework that enhances the robustness of MLLMs through randomized noise injection and clustering-based prediction aggregation. Our method perturbs continuous modalities (e.g., images and audio) with Gaussian noise, generates multiple candidate outputs, and applies embedding-based clustering to filter out adversarially influenced predictions. The final answer is selected from the majority cluster,…
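
The pipeline described in the abstract (perturb, sample, cluster, pick from the majority) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `model` and `embed` are hypothetical callables standing in for an MLLM and a text-embedding model, the tiny 2-means routine is a swapped-in stand-in for whatever clustering the paper uses, and choosing the answer nearest the majority centroid is one plausible selection rule, since the abstract is cut off before it states the exact one.

```python
import numpy as np


def add_gaussian_noise(image, sigma, rng):
    """Return a copy of `image` (floats in [0, 1]) with i.i.d. Gaussian noise added."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def two_means(embeddings, n_iter=20):
    """Tiny 2-means clustering (an assumption; the paper's clustering may differ)."""
    embeddings = np.asarray(embeddings, dtype=float)
    # Initialise with the first point and the point farthest from it.
    first = embeddings[0]
    farthest = embeddings[np.argmax(np.linalg.norm(embeddings - first, axis=1))]
    centroids = np.stack([first, farthest])
    labels = np.zeros(len(embeddings), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(embeddings[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = embeddings[labels == k].mean(axis=0)
    return labels, centroids


def smooth_predict(model, embed, image, prompt, n_samples=8, sigma=0.1, seed=0):
    """Query `model` on noisy copies of `image` and return a majority-cluster answer.

    `model(image, prompt) -> str` and `embed(text) -> 1-D array` are hypothetical
    interfaces; `n_samples` and `sigma` are illustrative defaults, not paper values.
    """
    rng = np.random.default_rng(seed)
    answers = [model(add_gaussian_noise(image, sigma, rng), prompt)
               for _ in range(n_samples)]
    embeddings = np.stack([np.asarray(embed(a), dtype=float) for a in answers])
    labels, centroids = two_means(embeddings)
    majority = np.bincount(labels, minlength=2).argmax()
    members = np.flatnonzero(labels == majority)
    # Assumed selection rule: the member closest to the majority centroid.
    dists = np.linalg.norm(embeddings[members] - centroids[majority], axis=1)
    return answers[members[np.argmin(dists)]]
```

In this sketch, responses that an adversarial perturbation still manages to steer would tend to land in the minority cluster and be discarded, while larger `sigma` values trade clean-input accuracy for robustness.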

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Topic Modeling