Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning
By: Beyazit Yalcinkaya, Marcell Vazquez-Chanlatte, Ameesh Shah, and more
Potential Business Impact:
Teaches robot teams to do many jobs together.
We study the problem of learning multi-task, multi-agent policies for cooperative, temporal objectives under the centralized-training, decentralized-execution paradigm. In this setting, representing tasks as automata enables the decomposition of complex tasks into simpler sub-tasks that can be assigned to individual agents. However, existing approaches remain sample-inefficient and are limited to the single-task case. In this work, we present Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning (ACC-MARL), a framework for learning task-conditioned, decentralized team policies. We identify the main challenges to making ACC-MARL feasible in practice, propose solutions, and prove the correctness of our approach. We further show that the value functions of the learned policies can be used to assign tasks optimally at test time. Experiments show emergent task-aware, multi-step coordination among agents, e.g., pressing a button to unlock a door, holding the door open, and short-circuiting tasks.
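The test-time task assignment described above can be sketched in miniature. The idea is that, once policies are conditioned on automaton sub-tasks, each learned value function estimates how well a given agent would fare on a given sub-task, so the team can pick the assignment maximizing total estimated value. The snippet below is a minimal illustration under that assumption; the value matrix `V` is hypothetical and stands in for queries to learned value functions, which the paper does not specify at this level of detail.

```python
from itertools import permutations

def assign_tasks(values):
    """values[i][j]: estimated value of agent i executing sub-task j
    (in ACC-MARL these would come from learned task-conditioned value
    functions). Returns the one-to-one assignment, as a tuple of task
    indices (one per agent), that maximizes total team value."""
    n = len(values)
    best_total, best_perm = float("-inf"), None
    # Brute-force search over all one-to-one assignments; fine for
    # small teams, a Hungarian-algorithm solver would scale better.
    for perm in permutations(range(n)):
        total = sum(values[i][perm[i]] for i in range(n))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_perm

# Hypothetical value estimates for 3 agents x 3 automaton sub-tasks.
V = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.4],
    [0.1, 0.5, 0.7],
]
print(assign_tasks(V))  # -> (0, 1, 2)
```

The brute-force loop is exponential in team size; for larger teams the same value matrix can be fed to a linear-assignment solver such as `scipy.optimize.linear_sum_assignment`.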
Similar Papers
Goal-Oriented Multi-Agent Reinforcement Learning for Decentralized Agent Teams
Multiagent Systems
Helps self-driving vehicles work together better.
Multi-Agent Reinforcement Learning in Intelligent Transportation Systems: A Comprehensive Survey
Machine Learning (CS)
Helps self-driving cars learn to work together.
Multi-Agent Reinforcement Learning and Real-Time Decision-Making in Robotic Soccer for Virtual Environments
Robotics
Teaches robot soccer teams to play better together.