Score: 0

Regret Lower Bounds for Decentralized Multi-Agent Stochastic Shortest Path Problems

Published: November 6, 2025 | arXiv ID: 2511.04594v1

By: Utkarsh U. Chavan, Prashant Trivedi, Nandyala Hemachandra

Potential Business Impact:

Helps robots work together to finish tasks.

Business Areas:

Intelligent Systems Artificial Intelligence, Data and Analytics, Science and Engineering

Multi-agent systems (MAS) are central to applications such as swarm robotics and traffic routing, where agents must coordinate in a decentralized manner to achieve a common objective. Stochastic Shortest Path (SSP) problems provide a natural framework for modeling decentralized control in such settings. While the problem of learning in SSP has been extensively studied in single-agent settings, the decentralized multi-agent variant remains largely unexplored. In this work, we take a step towards addressing that gap. We study decentralized multi-agent SSPs (Dec-MASSPs) under linear function approximation, where the transition dynamics and costs are represented using linear models. Applying novel symmetry-based arguments, we identify the structure of optimal policies. Our main contribution is the first regret lower bound for this setting based on the construction of hard-to-learn instances for any number of agents, $n$. Our regret lower bound of $\Omega(\sqrt{K})$, over $K$ episodes, highlights the inherent learning difficulty in Dec-MASSPs. These insights clarify the learning complexity of decentralized control and can further guide the design of efficient learning algorithms in multi-agent systems.