Prompt to Pwn: Automated Exploit Generation for Smart Contracts
By: Zeke Xiao, Yuekang Li, Qin Wang and more
Potential Business Impact:
Finds software bugs automatically to prevent hacks.
We explore the feasibility of using LLMs for Automated Exploit Generation (AEG) against vulnerable smart contracts. We present ReX, a framework integrating LLM-based exploit synthesis with the Foundry testing suite, enabling the automated generation and validation of proof-of-concept (PoC) exploits. We evaluate five state-of-the-art LLMs (GPT-4.1, Gemini 2.5 Pro, Claude Opus 4, DeepSeek, and Qwen3 Plus) on both synthetic benchmarks and real-world smart contracts affected by known high-impact exploits. Our results show that modern LLMs can reliably generate functional PoC exploits for diverse vulnerability types, with success rates reaching up to 92%. Notably, Gemini 2.5 Pro and GPT-4.1 consistently outperform others in both synthetic and real-world scenarios. We further analyze factors influencing AEG effectiveness, including model capabilities, contract structure, and vulnerability types. We also collect the first curated dataset of real-world PoC exploits to support future research.
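The abstract describes pairing LLM-based exploit synthesis with Foundry-based validation but does not spell out the pipeline. The following is a minimal sketch of one way such a generate-and-validate loop could look; it is not the ReX implementation. The names `synthesize_poc`, `run_forge`, and `Attempt`, the `generate` callback standing in for an LLM call, the `test/Exploit.t.sol` path, and the retry-with-feedback step are all assumptions made for illustration. The only external tool invoked is `forge test --match-path`.

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class Attempt:
    passed: bool  # True when the Foundry test run exited with status 0
    log: str      # combined stdout/stderr, reused as feedback for the next round


def run_forge(project: Path, test_file: Path) -> Attempt:
    """Run a single Foundry test file inside `project` (requires `forge` on PATH)."""
    proc = subprocess.run(
        ["forge", "test", "--match-path", str(test_file.relative_to(project))],
        cwd=project,
        capture_output=True,
        text=True,
    )
    return Attempt(proc.returncode == 0, proc.stdout + proc.stderr)


def synthesize_poc(
    contract_src: str,
    generate: Callable[[str, str], str],  # hypothetical LLM wrapper: (source, feedback) -> test code
    project: Path,
    runner: Callable[[Path, Path], Attempt] = run_forge,
    max_rounds: int = 3,                  # assumed retry budget, not taken from the paper
) -> Optional[str]:
    """Return the first candidate PoC whose test passes, or None if none does."""
    test_file = project / "test" / "Exploit.t.sol"  # assumed location
    test_file.parent.mkdir(parents=True, exist_ok=True)
    feedback = ""
    for _ in range(max_rounds):
        candidate = generate(contract_src, feedback)
        test_file.write_text(candidate)
        attempt = runner(project, test_file)
        if attempt.passed:
            return candidate
        feedback = attempt.log  # feed the failure output into the next prompt
    return None
```

Keeping the runner injectable lets the loop be exercised without a Foundry installation; in a real setup `generate` would wrap a model API call and `project` would be an initialized Foundry project containing the target contract.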
Similar Papers
Good News for Script Kiddies? Evaluating Large Language Models for Automated Exploit Generation
Cryptography and Security
AI can write code to break computer programs.
A Systematic Study on Generating Web Vulnerability Proof-of-Concepts Using Large Language Models
Software Engineering
Computers find security flaws automatically.
Evaluating LLMs for One-Shot Patching of Real and Artificial Vulnerabilities
Cryptography and Security
Fixes computer bugs automatically, better on real ones.