A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5
By: Xingjun Ma, Yixu Wang, Hengyuan Xu, and more
Potential Business Impact:
Evaluates six frontier AI models for safety across language, vision-language, and image-generation tasks.
The rapid evolution of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) has driven major gains in reasoning, perception, and generation across language and vision. Whether these advances translate into comparable improvements in safety remains unclear, partly because existing evaluations are fragmented across isolated modalities or threat models. In this report, we present an integrated safety evaluation of six frontier models (GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5), assessing each across language, vision-language, and image generation using a unified protocol that combines benchmark, adversarial, multilingual, and compliance evaluations. Aggregating the results into safety leaderboards and model profiles reveals a highly uneven safety landscape: GPT-5.2 demonstrates consistently strong and balanced performance, while other models exhibit clear trade-offs among benchmark safety, adversarial robustness, multilingual generalization, and regulatory compliance. Despite strong results under standard benchmarks, all models remain highly vulnerable under adversarial testing, with worst-case safety rates dropping below 6%. Text-to-image models show somewhat stronger alignment in regulated visual risk categories, yet remain fragile when faced with adversarial or semantically ambiguous prompts. Overall, these findings show that safety in frontier models is inherently multidimensional, shaped by modality, language, and evaluation design, underscoring the need for standardized, holistic safety assessments that better reflect real-world risk and guide responsible deployment.
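To illustrate the kind of aggregation the abstract describes, the minimal sketch below shows one way per-axis safety rates could be rolled up into a leaderboard with both average and worst-case views. It is not the report's actual scoring protocol; the model names, evaluation axes, and scores are hypothetical placeholders.

```python
# Hypothetical sketch: aggregating per-axis safety rates into a leaderboard.
# Scores are illustrative placeholders, not the report's measurements.
from statistics import mean

# Per-model safety rates (fraction of prompts handled safely) on four
# assumed evaluation axes: benchmark, adversarial, multilingual, compliance.
results = {
    "model_a": {"benchmark": 0.97, "adversarial": 0.05, "multilingual": 0.88, "compliance": 0.92},
    "model_b": {"benchmark": 0.93, "adversarial": 0.04, "multilingual": 0.71, "compliance": 0.85},
}

def profile(scores: dict[str, float]) -> dict[str, float]:
    """Summarize one model: mean safety across axes and the single worst axis."""
    return {
        "mean_safety": mean(scores.values()),  # balanced view across axes
        "worst_case": min(scores.values()),    # weakest axis (e.g., adversarial)
    }

# Rank models by mean safety, reporting the worst-case rate alongside it.
leaderboard = sorted(
    ((name, profile(scores)) for name, scores in results.items()),
    key=lambda item: item[1]["mean_safety"],
    reverse=True,
)

for rank, (name, summary) in enumerate(leaderboard, start=1):
    print(f"{rank}. {name}: mean={summary['mean_safety']:.2f}, worst={summary['worst_case']:.2f}")
```

Reporting the worst-case rate next to the mean mirrors the report's observation that models can look strong on standard benchmarks while remaining highly vulnerable under adversarial testing.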