Score: 0

A Practical Framework for Evaluating Medical AI Security: Reproducible Assessment of Jailbreaking and Privacy Vulnerabilities Across Clinical Specialties

Published: December 9, 2025 | arXiv ID: 2512.08185v1

By: Jinghao Wang, Ping Zhang, Carter Yagemann

Potential Business Impact:

Tests medical AI for safety on regular computers.

Business Areas:

Machine Learning Artificial Intelligence, Data and Analytics, Software

Medical Large Language Models (LLMs) are increasingly deployed for clinical decision support across diverse specialties, yet systematic evaluation of their robustness to adversarial misuse and privacy leakage remains inaccessible to most researchers. Existing security benchmarks require GPU clusters, commercial API access, or protected health data -- barriers that limit community participation in this critical research area. We propose a practical, fully reproducible framework for evaluating medical AI security under realistic resource constraints. Our framework design covers multiple medical specialties stratified by clinical risk -- from high-risk domains such as emergency medicine and psychiatry to general practice -- addressing jailbreaking attacks (role-playing, authority impersonation, multi-turn manipulation) and privacy extraction attacks. All evaluation utilizes synthetic patient records requiring no IRB approval. The framework is designed to run entirely on consumer CPU hardware using freely available models, eliminating cost barriers. We present the framework specification including threat models, data generation methodology, evaluation protocols, and scoring rubrics. This proposal establishes a foundation for comparative security assessment of medical-specialist models and defense mechanisms, advancing the broader goal of ensuring safe and trustworthy medical AI systems.

Towards Safe AI Clinicians: A Comprehensive Study on Large Language Model Jailbreaking in Healthcare

Cryptography and Security

AI doctors can be tricked into giving bad advice.

27 Jan 2025 0

93%

Ethical Risks in Deploying Large Language Models: An Evaluation of Medical Ethics Jailbreaking

Computers and Society

AI models fail to block harmful medical advice.

19 Jan 2026 0

92%

TeleAI-Safety: A comprehensive LLM jailbreaking benchmark towards attacks, defenses, and evaluations

Cryptography and Security

Tests AI to find and fix safety problems.

5 Dec 2025 1

View PDF Login to Bookmark

Page Count

10 pages

A Practical Framework for Evaluating Medical AI Security: Reproducible Assessment of Jailbreaking and Privacy Vulnerabilities Across Clinical Specialties

Tests medical AI for safety on regular computers.

Technical Abstract

Towards Safe AI Clinicians: A Comprehensive Study on Large Language Model Jailbreaking in Healthcare

Ethical Risks in Deploying Large Language Models: An Evaluation of Medical Ethics Jailbreaking

TeleAI-Safety: A comprehensive LLM jailbreaking benchmark towards attacks, defenses, and evaluations