Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
By: Indrajit Kar, Kalathur Chenchu Kishore Kumar
Large Language Models and multi-agent systems have shown promise in decomposing complex tasks, yet they struggle with long-horizon reasoning and escalating computation cost. This work introduces a hierarchical multi-agent architecture that distributes reasoning across a 64×64 grid of lightweight agents, supported by a selective oracle. A spatial curriculum progressively expands the operational region of the grid, ensuring that agents master easier central tasks before tackling harder peripheral ones. To improve reliability, the system integrates Negative Log-Likelihood (NLL) as a measure of confidence, allowing the curriculum to prioritize regions where agents are both accurate and well calibrated. A Thompson Sampling curriculum manager adaptively chooses training zones based on competence and NLL-driven reward signals. We evaluate the approach on a spatially grounded Tower of Hanoi benchmark, which mirrors the long-horizon structure of many robotic manipulation and planning tasks. Results demonstrate improved stability, reduced oracle usage, and stronger long-range reasoning from distributed agent cooperation.
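To make the curriculum idea concrete, the following is a minimal sketch of how a Thompson Sampling manager could pick training zones on a 64×64 grid. It is not the authors' implementation. The abstract does not specify the zone layout, the posterior model, or the reward formula, so everything here is an assumption made for illustration: the concentric-ring zoning in `zone_of`, the Beta posterior per zone in `ThompsonZoneSelector`, and the reward `1 / (1 + NLL)` for correct moves in `nll_reward` are all hypothetical.

```python
import math
import random


def zone_of(row, col, grid_size=64, num_zones=4):
    """Map a grid cell to a ring index: 0 = center, num_zones - 1 = periphery.

    Assumption: zones are concentric square rings of equal width.
    """
    center = (grid_size - 1) / 2.0
    dist = max(abs(row - center), abs(col - center))  # Chebyshev distance
    ring_width = grid_size / (2.0 * num_zones)
    return min(int(dist // ring_width), num_zones - 1)


def nll_reward(correct, prob_of_chosen_move):
    """Assumed reward in [0, 1]: confident, correct moves score highest."""
    if not correct:
        return 0.0
    nll = -math.log(max(prob_of_chosen_move, 1e-12))
    return 1.0 / (1.0 + nll)


class ThompsonZoneSelector:
    """Thompson Sampling over zones with one Beta posterior per zone."""

    def __init__(self, num_zones, seed=0):
        self.alpha = [1.0] * num_zones  # prior pseudo-count of reward
        self.beta = [1.0] * num_zones   # prior pseudo-count of missing reward
        self.rng = random.Random(seed)

    def select(self):
        # Draw one sample per zone and train on the zone with the largest draw.
        draws = [self.rng.betavariate(a, b)
                 for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, zone, reward):
        # Fractional update so a reward in [0, 1] can be used directly.
        self.alpha[zone] += reward
        self.beta[zone] += 1.0 - reward

    def posterior_mean(self, zone):
        return self.alpha[zone] / (self.alpha[zone] + self.beta[zone])


if __name__ == "__main__":
    selector = ThompsonZoneSelector(num_zones=4)
    for _ in range(500):
        zone = selector.select()
        # Toy stand-in for agent behavior: inner zones are easier, so the
        # agent is assumed to be correct and more confident there.
        confidence = 0.9 - 0.2 * zone
        selector.update(zone, nll_reward(True, confidence))
    print([round(selector.posterior_mean(z), 3) for z in range(4)])
```

In this sketch, zones whose agents are both correct and confident accumulate reward faster, so they are sampled more often. A curriculum that should instead push toward harder peripheral zones once the center is mastered would need a different reward (for example, one based on learning progress), which the abstract does not detail.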