Score: 2

CuES: A Curiosity-driven and Environment-grounded Synthesis Framework for Agentic RL

Published: December 1, 2025 | arXiv ID: 2512.01311v1

By: Shinji Mai, Yunpeng Zhai, Ziqian Chen and more

BigTech Affiliations: Alibaba

Potential Business Impact:

Teaches AI to invent its own learning games.

Business Areas:

Artificial Intelligence, Data and Analytics, Science and Engineering, Software

Large language model-based agents are increasingly deployed in complex, tool-augmented environments. While reinforcement learning provides a principled mechanism for such agents to improve through interaction, its effectiveness critically depends on the availability of structured training tasks. In many realistic settings, however, no such tasks exist, a challenge we term task scarcity, which has become a key bottleneck for scaling agentic RL. Existing approaches typically assume predefined task collections, an assumption that fails in novel environments where tool semantics and affordances are initially unknown. To address this limitation, we formalize the problem of Task Generation for Agentic RL, where an agent must learn within a given environment that lacks predefined tasks. We propose CuES, a Curiosity-driven and Environment-grounded Synthesis framework that autonomously generates diverse, executable, and meaningful tasks directly from the environment's structure and affordances, without relying on handcrafted seeds or external corpora. CuES drives exploration through intrinsic curiosity, abstracts interaction patterns into reusable task schemas, and refines them through lightweight top-down guidance and memory-based quality control. Across three representative environments (AppWorld, BFCL, and WebShop), CuES produces task distributions that match or surpass manually curated datasets in both diversity and executability, yielding substantial downstream policy improvements. These results demonstrate that curiosity-driven, environment-grounded task generation provides a scalable foundation for agents that not only learn how to act, but also learn what to learn. The code is available at https://github.com/modelscope/AgentEvolver/research/CuES.
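The loop the abstract describes (explore by curiosity, abstract interactions into schemas, filter with a memory) can be illustrated with a toy sketch. This is not the CuES implementation: the `TOOLS` environment, the count-based novelty rule, and every function name below are hypothetical stand-ins chosen for illustration, and the top-down guidance step is omitted entirely.

```python
import random
from collections import Counter

# Hypothetical toy environment: tool names and arguments are invented.
TOOLS = {
    "search": ["shoes", "laptop"],
    "add_to_cart": ["item_1", "item_2"],
    "checkout": ["card"],
}


def explore(steps, rng, visits):
    """Roll out a trajectory, always trying one of the least-visited actions.

    Preferring rarely tried actions is a simple count-based stand-in
    for an intrinsic curiosity signal (an assumption, not the paper's rule).
    """
    actions = [(tool, arg) for tool, args in TOOLS.items() for arg in args]
    trajectory = []
    for _ in range(steps):
        least = min(visits[a] for a in actions)
        candidates = [a for a in actions if visits[a] == least]
        action = rng.choice(candidates)
        visits[action] += 1
        trajectory.append(action)
    return trajectory


def abstract_schema(trajectory):
    """Drop concrete arguments and keep the ordered tool pattern."""
    return tuple(tool for tool, _ in trajectory)


def synthesize_tasks(num_rollouts=20, steps=2, seed=0):
    """Collect one task per previously unseen schema."""
    rng = random.Random(seed)
    visits = Counter()   # how often each (tool, arg) action was tried
    memory = set()       # schemas already accepted; used to reject repeats
    tasks = []
    for _ in range(num_rollouts):
        trajectory = explore(steps, rng, visits)
        schema = abstract_schema(trajectory)
        if schema in memory:
            continue     # memory-based filter: skip duplicate schemas
        memory.add(schema)
        tasks.append({"schema": schema, "example": trajectory})
    return tasks


if __name__ == "__main__":
    for task in synthesize_tasks():
        print(task["schema"], "e.g.", task["example"])
```

Each synthesized task keeps the concrete trajectory that produced it, so every schema in this sketch is backed by at least one interaction that was actually carried out in the toy environment.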

Beyond Fixed Tasks: Unsupervised Environment Design for Task-Level Pairs

Machine Learning (CS)

Teaches robots to solve hard problems automatically.

16 Nov 2025 3

88%

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Artificial Intelligence

Computers learn new software by trying and failing.

6 Aug 2025 1

87%

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Artificial Intelligence

Computers learn to use new programs by trying.

6 Aug 2025 1


Country of Origin

🇨🇳 China

Repos / Data Links

github.com

Page Count

13 pages
