Agents of Change: Self-Evolving LLM Agents for Strategic Planning
By: Nikolas Belle, Dakota Barnes, Alfonso Amayuelas, et al.
Potential Business Impact:
Computers learn to play games better by fixing themselves.
Recent advances in LLMs have enabled their use as autonomous agents across a range of tasks, yet they continue to struggle with formulating and adhering to coherent long-term strategies. In this paper, we investigate whether LLM agents can self-improve when placed in environments that explicitly challenge their strategic planning abilities. Using the board game Settlers of Catan, accessed through the open-source Catanatron framework, we benchmark a progression of LLM-based agents, from a simple game-playing agent to systems capable of autonomously rewriting their own prompts and their player agent's code. We introduce a multi-agent architecture in which specialized roles (Analyzer, Researcher, Coder, and Player) collaborate to iteratively analyze gameplay, research new strategies, and modify the agent's logic or prompt. By comparing manually crafted agents to those evolved entirely by LLMs, we evaluate how effectively these systems can diagnose failure and adapt over time. Our results show that self-evolving agents, particularly when powered by models like Claude 3.7 and GPT-4o, outperform static baselines by autonomously adapting their strategies, passing sample behavior along to their game-playing agents, and demonstrating adaptive reasoning over multiple iterations.
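To make the architecture concrete, here is a minimal sketch of the kind of evolution loop the abstract describes. The four role names (Analyzer, Researcher, Coder, Player) come from the paper; everything else, including the class, the helper functions, and the loop structure, is an assumption for illustration and is not the authors' implementation or the Catanatron API.

```python
# Illustrative sketch of a self-evolving agent loop, assuming hypothetical
# helpers (play_games, llm) and a placeholder PlayerAgent. Only the role
# names are taken from the paper; the rest is invented for clarity.

from dataclasses import dataclass, field

@dataclass
class PlayerAgent:
    prompt: str                      # system prompt steering in-game decisions
    code: str                        # source of the decision-making logic
    history: list = field(default_factory=list)

def play_games(agent: PlayerAgent, n_games: int) -> list:
    """Placeholder: run n_games of Catan and return per-game logs."""
    return [{"won": False, "log": "..."} for _ in range(n_games)]

def llm(role: str, task: str) -> str:
    """Placeholder for a call to an LLM (e.g., Claude 3.7 or GPT-4o)."""
    return f"[{role}] response to: {task[:40]}..."

def evolve(agent: PlayerAgent, iterations: int = 5) -> PlayerAgent:
    for _ in range(iterations):
        logs = play_games(agent, n_games=10)
        # Analyzer: diagnose recurring failures from gameplay logs.
        diagnosis = llm("Analyzer", f"Find weaknesses in: {logs}")
        # Researcher: propose strategies that address the diagnosis.
        strategy = llm("Researcher", f"Strategies for: {diagnosis}")
        # Coder: rewrite the Player's prompt and code accordingly.
        agent.prompt = llm("Coder", f"Rewrite prompt using: {strategy}")
        agent.code = llm("Coder", f"Rewrite code using: {strategy}")
        agent.history.append((diagnosis, strategy))
    return agent
```

The point of splitting the loop into narrow roles, as the abstract suggests, is that each LLM call gets one well-scoped task (diagnose, research, or rewrite) rather than a single prompt that must do all three at once.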
Similar Papers
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence
Artificial Intelligence
Computers learn and improve by themselves.
Survey on Evaluation of LLM-based Agents
Artificial Intelligence
Tests how smart AI agents can act and learn.
A Self-Improving Coding Agent
Artificial Intelligence
Computers fix themselves to do tasks better.