Blue Teaming Function-Calling Agents
By: Greta Dolcetti, Giulio Zizzo, Sergio Maffeis
Potential Business Impact:
AI models can't be trusted with important jobs yet.
We present an experimental evaluation that assesses the robustness of four open source LLMs claiming function-calling capabilities against three different attacks, and we measure the effectiveness of eight different defences. Our results show how these models are not safe by default, and how the defences are not yet employable in real-world scenarios.
Similar Papers
CyberSleuth: Autonomous Blue-Team LLM Agent for Web Attack Forensics
Cryptography and Security
Helps find computer attackers and their tricks.
Bridging AI and Software Security: A Comparative Vulnerability Assessment of LLM Agent Deployment Paradigms
Cryptography and Security
Makes AI programs safer from hackers.
Measuring the Security of Mobile LLM Agents under Adversarial Prompts from Untrusted Third-Party Channels
Cryptography and Security
Finds ways bad apps trick phone AI.