Speculative Actions: A Lossless Framework for Faster Agentic Systems
By: Naimeng Ye, Arnav Ahuja, Georgios Liargkovas and more
Potential Business Impact:
AI agents act much faster by guessing ahead.
Despite growing interest in AI agents across industry and academia, their execution in an environment is often slow, hampering training, evaluation, and deployment. For example, a game of chess between two state-of-the-art agents may take hours. A critical bottleneck is that agent behavior unfolds sequentially: each action requires an API call, and these calls can be time-consuming. Inspired by speculative execution in microprocessors and speculative decoding in LLM inference, we propose speculative actions, a lossless framework for general agentic systems that predicts likely actions using faster models, enabling multiple steps to be executed in parallel. We evaluate this framework across three agentic environments (gaming, e-commerce, and web search) and a "lossy" extension for an operating systems environment. In all cases, speculative actions achieve substantial accuracy in next-action prediction (up to 55%), translating into significant reductions in end-to-end latency. Moreover, performance can be further improved through stronger guessing models, top-K action prediction, multi-step speculation, and uncertainty-aware optimization, opening a promising path toward deploying low-latency agentic systems in the real world.
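The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy environment whose state is an integer and whose transition function is side-effect free, so a guessed step can be discarded safely. The names `env_step`, `slow_policy`, `fast_guess`, `run_sequential`, and `run_speculative` are hypothetical. While the authoritative (slow) call for the current state is in flight, a cheap guesser predicts the action and the slow call for the guessed next state is launched in parallel. A correct guess reuses that work, and a wrong guess discards it, so the resulting trajectory matches plain sequential execution.

```python
from concurrent.futures import ThreadPoolExecutor


def env_step(state, action):
    # Hypothetical toy transition: pure function, safe to run on a guess.
    return state + action


def slow_policy(state):
    # Stand-in for the expensive, authoritative call (e.g. a large model API).
    return 2 if state % 3 == 0 else 1


def fast_guess(state):
    # Stand-in for a cheap guesser; wrong whenever state % 3 == 0.
    return 1


def run_sequential(state, steps):
    # Baseline: one authoritative call per step, strictly in order.
    actions = []
    for _ in range(steps):
        action = slow_policy(state)
        actions.append(action)
        state = env_step(state, action)
    return actions, state


def run_speculative(state, steps):
    actions, hits = [], 0
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Authoritative call for the current state.
        pending = pool.submit(slow_policy, state)
        for _ in range(steps):
            guess = fast_guess(state)
            # Speculate: start the next authoritative call on the guessed state
            # before the current one has returned.
            speculative = pool.submit(slow_policy, env_step(state, guess))
            action = pending.result()  # the true action always decides
            actions.append(action)
            state = env_step(state, action)
            if action == guess:
                hits += 1
                pending = speculative  # reuse the work already in flight
            else:
                speculative.cancel()  # best effort; the result is ignored anyway
                pending = pool.submit(slow_policy, state)
    return actions, state, hits


if __name__ == "__main__":
    print(run_sequential(0, 6))
    print(run_speculative(0, 6))
```

Because the true action from `slow_policy` is always the one applied, a wrong guess costs only wasted computation, never correctness. In a real agentic system the speculated step would also need to avoid irreversible side effects, which this toy transition sidesteps by assumption.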
Similar Papers
Reducing Latency of LLM Search Agent via Speculation-based Algorithm-System Co-Design
Artificial Intelligence
Makes smart computer searches much faster.
Dynamic Speculative Agent Planning
Artificial Intelligence
Makes AI faster and cheaper to use.