Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation

Published: November 27, 2025 | arXiv ID: 2511.22235v1

By: Zehao Deng, Tianjie Ju, Zheng Wu and more

Potential Business Impact:

Helps computers finish long, complicated jobs.

Business Areas:

Virtual Assistant Software

The rapid development of large vision-language models (VLMs) has greatly advanced research on GUI agents. However, GUI agents still face significant challenges in handling long-horizon tasks. First, single-agent models struggle to balance high-level capabilities with low-level execution capabilities, and commonly suffer from responsibility coupling and capability conflicts. Second, agents lack awareness of the task state, which leads to lost progress in long-horizon tasks. To address these challenges, we propose a staged execution-feedback reinforcement learning algorithm. Rather than training a unified policy model, we focus on training high-level scheduling models. Specifically, we propose and train two agents: a Coordinator, responsible for strategic planning and task decomposition; and a State Tracker, responsible for context compression and information management to maintain the task's state and coherence. Based on this, we build the Coordinator-Executor-State Tracker (CES) multi-agent framework, which can be integrated with any low-level Executor model and assists the Executor in solving long-horizon tasks through task scheduling and state management. Experiments on long-horizon task benchmarks demonstrate that CES significantly enhances the system's planning and state management capabilities. Furthermore, analysis confirms that our trained high-level scheduling module is a generalizable, plug-and-play module that significantly enhances the long-horizon capabilities of various Executors. Code is available at https://github.com/hehehahi4/CES.
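The division of labor described in the abstract can be pictured as a simple control loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `TaskState`, `run_ces`, and the three toy functions are hypothetical, and the loop order (plan, execute, update state) is an assumption based on the roles named above. The real agents are trained models; here they are stand-in Python callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List


# NOTE: every name in this sketch is hypothetical and chosen for illustration.
@dataclass
class TaskState:
    """Compressed record of progress, as a State Tracker might maintain it."""
    summary: str = ""
    completed: List[str] = field(default_factory=list)


# Each role is a plain callable, so any Executor can be swapped in.
Coordinator = Callable[[str, TaskState], str]               # (goal, state) -> next subtask, "" when done
Executor = Callable[[str], str]                             # subtask -> execution feedback
StateTracker = Callable[[TaskState, str, str], TaskState]   # (state, subtask, feedback) -> new state


def run_ces(goal: str, coordinator: Coordinator, executor: Executor,
            tracker: StateTracker, max_steps: int = 20) -> TaskState:
    """Assumed loop: the Coordinator plans, the Executor acts, the State Tracker updates."""
    state = TaskState()
    for _ in range(max_steps):
        subtask = coordinator(goal, state)
        if not subtask:          # Coordinator signals that nothing is left to do
            break
        feedback = executor(subtask)
        state = tracker(state, subtask, feedback)
    return state


# Toy stand-ins for the three roles (assumptions, not trained models).
def toy_coordinator(goal: str, state: TaskState) -> str:
    steps = [s.strip() for s in goal.split(";")]
    remaining = [s for s in steps if s not in state.completed]
    return remaining[0] if remaining else ""


def toy_executor(subtask: str) -> str:
    return f"done: {subtask}"


def toy_tracker(state: TaskState, subtask: str, feedback: str) -> TaskState:
    completed = state.completed + [subtask]
    return TaskState(summary=f"{len(completed)} step(s) finished", completed=completed)


final = run_ces("open settings; enable wifi", toy_coordinator, toy_executor, toy_tracker)
print(final.completed)
```

Because the Executor is passed in as an argument, the sketch mirrors the plug-and-play claim: replacing `toy_executor` with another callable leaves the scheduling and state-tracking code unchanged.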

Towards General Computer Control with Hierarchical Agents and Multi-Level Action Spaces

Artificial Intelligence

Lets computers control apps faster and on your device.

22 Sep 2025 1

88%

Learning to Ball: Composing Policies for Long-Horizon Basketball Moves

Graphics

Teaches robots to do many complex actions.

26 Sep 2025 1

88%

A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks

Computation and Language

Teaches AI to plan better, avoiding mistakes.

7 Oct 2025 1

Country of Origin

🇨🇳 China

Repos / Data Links

github.com

Page Count

22 pages
