Score: 2

MCP-SafetyBench: A Benchmark for Safety Evaluation of Large Language Models with Real-World MCP Servers

Published: December 17, 2025 | arXiv ID: 2512.15163v1

By: Xuanjun Zong , Zhiqi Shen , Lei Wang and more

Potential Business Impact:

Tests AI safety with real-world tools.

Business Areas:

Simulation Software

Large language models (LLMs) are evolving into agentic systems that reason, plan, and operate external tools. The Model Context Protocol (MCP) is a key enabler of this transition, offering a standardized interface for connecting LLMs with heterogeneous tools and services. Yet MCP's openness and multi-server workflows introduce new safety risks that existing benchmarks fail to capture, as they focus on isolated attacks or lack real-world coverage. We present MCP-SafetyBench, a comprehensive benchmark built on real MCP servers that supports realistic multi-turn evaluation across five domains: browser automation, financial analysis, location navigation, repository management, and web search. It incorporates a unified taxonomy of 20 MCP attack types spanning server, host, and user sides, and includes tasks requiring multi-step reasoning and cross-server coordination under uncertainty. Using MCP-SafetyBench, we systematically evaluate leading open- and closed-source LLMs, revealing large disparities in safety performance and escalating vulnerabilities as task horizons and server interactions grow. Our results highlight the urgent need for stronger defenses and establish MCP-SafetyBench as a foundation for diagnosing and mitigating safety risks in real-world MCP deployments.

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Computation and Language

Tests AI's ability to use many tools together.

28 Aug 2025 1

93%

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

Cryptography and Security

Finds security flaws in AI tools.

17 Aug 2025 2

92%

MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents

Cryptography and Security

Tests if AI can use tools safely.

14 Oct 2025 0

View PDF Login to Bookmark

Country of Origin

🇨🇳 🇸🇬 China, Singapore

Repos / Data Links

github.com

Page Count

24 pages

MCP-SafetyBench: A Benchmark for Safety Evaluation of Large Language Models with Real-World MCP Servers

Tests AI safety with real-world tools.

Technical Abstract

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents