A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5

Xingjun Ma; Yixu Wang; Hengyuan Xu; Yutao Wu; Yifan Ding; Yunhan Zhao; Zilong Wang; Jiabin Hua; Ming Wen; Jianan Liu; Ranjie Duan; Yifeng Gao; Yingshui Tan; Yunhao Chen; Hui Xue; Xin Wang; Wei Cheng; Jingjing Chen; Zuxuan Wu; Bo Li; Yu-Gang Jiang

arXiv:2601.10527 · cs.AI · January 19, 2026

Open Access

TL;DR

This report evaluates the safety of six frontier language and multimodal models across multiple dimensions. The models perform well on standard benchmarks but show significant vulnerabilities under adversarial conditions, which underscores the need for comprehensive safety assessments.

Contribution

The report introduces a unified safety evaluation protocol for frontier LLMs and MLLMs and provides a comparative safety landscape across multiple modalities and threat models.

Findings

01. GPT-5.2 shows consistently strong and balanced safety performance.

02. All models are vulnerable to adversarial attacks, under which safety rates fall below 6%.

03. Visual models are slightly safer but remain fragile under adversarial prompts.

Abstract

The rapid evolution of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) has driven major gains in reasoning, perception, and generation across language and vision. Whether these advances translate into comparable improvements in safety remains unclear, partly because existing evaluations are fragmented and focus on isolated modalities or threat models. In this report, we present an integrated safety evaluation of six frontier models (GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5), assessing each across language, vision-language, and image generation using a unified protocol that combines benchmark, adversarial, multilingual, and compliance evaluations. By aggregating results into safety leaderboards and model profiles, we reveal a highly uneven safety landscape: while GPT-5.2 demonstrates consistently strong and balanced performance,…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Multimodal Machine Learning Applications