Prompt to Pwn: Automated Exploit Generation for Smart Contracts

Published: August 2, 2025 | arXiv ID: 2508.01371v1

By: Zeke Xiao, Yuekang Li, Qin Wang, and more

Potential Business Impact:

Automatically generates proof-of-concept exploits for vulnerable smart contracts, so flaws can be found before they are hacked.

We explore the feasibility of using LLMs for Automated Exploit Generation (AEG) against vulnerable smart contracts. We present ReX, a framework integrating LLM-based exploit synthesis with the Foundry testing suite, enabling the automated generation and validation of proof-of-concept (PoC) exploits. We evaluate five state-of-the-art LLMs (GPT-4.1, Gemini 2.5 Pro, Claude Opus 4, DeepSeek, and Qwen3 Plus) on both synthetic benchmarks and real-world smart contracts affected by known high-impact exploits. Our results show that modern LLMs can reliably generate functional PoC exploits for diverse vulnerability types, with success rates reaching up to 92%. Notably, Gemini 2.5 Pro and GPT-4.1 consistently outperform others in both synthetic and real-world scenarios. We further analyze factors influencing AEG effectiveness, including model capabilities, contract structure, and vulnerability types. We also collect the first curated dataset of real-world PoC exploits to support future research.

Good News for Script Kiddies? Evaluating Large Language Models for Automated Exploit Generation

Cryptography and Security

AI can write code to break computer programs.

2 May 2025

88%

A Systematic Study on Generating Web Vulnerability Proof-of-Concepts Using Large Language Models

Software Engineering

Computers find security flaws automatically.

11 Oct 2025

88%

Evaluating LLMs for One-Shot Patching of Real and Artificial Vulnerabilities

Cryptography and Security

Fixes computer bugs automatically, better on real ones.

28 Nov 2025

Page Count: 11 pages
