GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators
By: Jiacheng Guo, Ling Yang, Peter Chen, and more
Potential Business Impact:
Teaches AI new skills by playing games.
Training capable Large Language Model (LLM) agents is critically bottlenecked by the high cost and static nature of real-world interaction data. We address this by introducing GenEnv, a framework that establishes a difficulty-aligned co-evolutionary game between an agent and a scalable, generative environment simulator. Unlike traditional methods that evolve models on static datasets, GenEnv instantiates a data-evolving paradigm: the simulator acts as a dynamic curriculum policy, continuously generating tasks tailored to the agent's "zone of proximal development". This process is guided by a simple but effective α-Curriculum Reward, which aligns task difficulty with the agent's current capabilities. We evaluate GenEnv on five benchmarks: API-Bank, ALFWorld, BFCL, Bamboogle, and TravelPlanner. Across these tasks, GenEnv improves agent performance by up to +40.3% over 7B baselines and matches or exceeds the average performance of larger models. Compared to Gemini 2.5 Pro-based offline data augmentation, GenEnv achieves better performance while using 3.3× less data. By shifting from static supervision to adaptive simulation, GenEnv provides a data-efficient pathway for scaling agent capabilities.
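The abstract does not give the exact form of the α-Curriculum Reward, but the general idea is to reward the simulator for proposing tasks whose difficulty sits near the agent's current ability. The sketch below illustrates one simple way such a reward and co-evolution step could look, assuming the reward peaks when the agent's empirical solve rate on a proposed task is close to a target level alpha; all function and parameter names here are illustrative, not the paper's implementation.

```python
# Minimal sketch of a difficulty-aligned curriculum reward (illustrative only;
# not the paper's actual formulation). The simulator is scored on how close
# the agent's empirical solve rate on a proposed task is to a target level
# `alpha`, i.e. the agent's "zone of proximal development".

from typing import Callable, Sequence


def alpha_curriculum_reward(solve_rate: float, alpha: float = 0.5) -> float:
    """Reward for a proposed task, peaking at 1.0 when the agent solves it
    about `alpha` of the time and decaying linearly as the task becomes
    trivially easy (solve_rate -> 1) or hopelessly hard (solve_rate -> 0)."""
    return 1.0 - abs(solve_rate - alpha) / max(alpha, 1.0 - alpha)


def co_evolution_step(
    propose_tasks: Callable[[int], Sequence[str]],
    estimate_solve_rate: Callable[[str], float],
    num_tasks: int = 16,
    alpha: float = 0.5,
) -> list[tuple[str, float]]:
    """One round of the generator/agent loop: propose tasks, estimate the
    agent's solve rate on each (e.g. from a few rollouts), and score them with
    the curriculum reward. Returned (task, reward) pairs can update the
    simulator, and the highest-reward tasks feed back into agent training."""
    tasks = propose_tasks(num_tasks)
    scored = [
        (task, alpha_curriculum_reward(estimate_solve_rate(task), alpha))
        for task in tasks
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Under this reading, the two policies co-evolve: as the agent improves, previously hard tasks drift toward a solve rate above alpha and earn less reward, pushing the simulator to generate harder ones.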
Similar Papers
AutoEnv: Automated Environments for Measuring Cross-Environment Agent Learning
Artificial Intelligence
Teaches AI to learn in many different worlds.
Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey
Machine Learning (CS)
Teaches AI to learn by doing, not just reading.
Exploration-Driven Generative Interactive Environments
CV and Pattern Recognition
Teaches AI to learn from many virtual worlds.