On the Adversarial Robustness of 3D Large Vision-Language Models

Chao Liu; Ngai-Man Cheung

arXiv:2601.06464·cs.CV·January 13, 2026


Open Access

TL;DR

This paper systematically studies the adversarial robustness of 3D vision-language models, revealing significant vulnerabilities under untargeted attacks and emphasizing the need for robustness improvements in safety-critical applications.

Contribution

The paper presents the first systematic study of adversarial robustness in point-based 3D VLMs and proposes two complementary attack strategies, Vision Attack and Caption Attack, to evaluate it.

Findings

01

3D VLMs are vulnerable to untargeted adversarial attacks

02

3D VLMs are more resilient to targeted attacks than their 2D counterparts

03

These results highlight the need for robustness improvements in 3D VLMs

Abstract

3D Vision-Language Models (VLMs), such as PointLLM and GPT4Point, have shown strong reasoning and generalization abilities in 3D understanding tasks. However, their adversarial robustness remains largely unexplored. Prior work in 2D VLMs has shown that the integration of visual inputs significantly increases vulnerability to adversarial attacks, making these models easier to manipulate into generating toxic or misleading outputs. In this paper, we investigate whether incorporating 3D vision similarly compromises the robustness of 3D VLMs. To this end, we present the first systematic study of adversarial robustness in point-based 3D VLMs. We propose two complementary attack strategies: Vision Attack, which perturbs the visual token features produced by the 3D encoder and projector to assess the robustness of vision-language alignment; and Caption Attack, which directly…
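To make the idea of perturbing visual token features concrete, here is a minimal sketch of an untargeted, L-infinity-bounded projected gradient attack on a matrix of token features. It is not the paper's method: the excerpt does not give the loss, the perturbation budget, or the optimizer, so everything below is an assumption. In particular, `loss_and_grad` is a hypothetical toy stand-in (mean-pooled tokens fed to a linear head `W`) for the language model, and `eps`, `alpha`, and `steps` are illustrative values.

```python
import numpy as np


def loss_and_grad(tokens, W, target):
    """Cross-entropy of a toy linear head on mean-pooled tokens, plus its gradient.

    Hypothetical stand-in for a real 3D VLM: tokens has shape (n_tokens, dim),
    W has shape (n_classes, dim), target is the index of the correct class.
    """
    n_tokens = tokens.shape[0]
    pooled = tokens.mean(axis=0)
    logits = W @ pooled
    shifted = logits - logits.max()  # stabilise the softmax
    probs = np.exp(shifted) / np.exp(shifted).sum()
    loss = -np.log(probs[target])
    d_logits = probs.copy()
    d_logits[target] -= 1.0  # gradient of cross-entropy w.r.t. logits
    d_pooled = W.T @ d_logits
    # Mean pooling spreads the gradient equally over all tokens.
    grad = np.tile(d_pooled / n_tokens, (n_tokens, 1))
    return loss, grad


def feature_pgd(tokens, W, target, eps=0.05, alpha=0.01, steps=20):
    """Untargeted attack: raise the loss while keeping |delta| <= eps elementwise."""
    delta = np.zeros_like(tokens)
    for _ in range(steps):
        _, grad = loss_and_grad(tokens + delta, W, target)
        # Ascent step on the loss, then projection back onto the eps-box.
        delta = np.clip(delta + alpha * np.sign(grad), -eps, eps)
    return tokens + delta


rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 16))  # assumed: 8 visual tokens of width 16
W = rng.normal(size=(5, 16))       # assumed: toy head with 5 output classes
target = 2

clean_loss, _ = loss_and_grad(tokens, W, target)
adv_tokens = feature_pgd(tokens, W, target)
adv_loss, _ = loss_and_grad(adv_tokens, W, target)
print(f"clean loss: {clean_loss:.4f}  adversarial loss: {adv_loss:.4f}")
```

In a real setting the gradient would come from automatic differentiation through the language model rather than from a closed-form expression, and the perturbation would be applied to the output of the actual 3D encoder and projector.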


Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Multimodal Machine Learning Applications