Policy Optimization in Multi-Agent Settings under Partially Observable Environments
By: Ainur Zhaikhan, Malek Khammassi, Ali H. Sayed
Potential Business Impact:
Helps teams of robots learn to cooperate faster when each one can see only part of the environment.
This work leverages adaptive social learning to estimate partially observable global states in multi-agent reinforcement learning (MARL) problems. Unlike existing methods, the proposed approach enables social learning and reinforcement learning to operate concurrently. Specifically, it alternates between a single step of social learning and a single step of MARL, eliminating the need for time- and computation-intensive two-timescale learning frameworks. Theoretical guarantees are provided to support the effectiveness of the proposed method. Simulation results verify that the performance of the proposed methodology can approach that of reinforcement learning with access to the true state.
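To make the single-timescale alternation concrete, the sketch below illustrates one possible reading of the loop: each iteration performs one adaptive social-learning step (a discounted local Bayesian update followed by geometric averaging of beliefs with neighbors) and then one tabular Q-learning step driven by the resulting state estimate. All names, the ring topology, the Gaussian observation model, the toy reward, and the step sizes are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, N_STATES, N_ACTIONS = 3, 4, 2
DELTA = 0.1               # adaptivity step size of the social-learning update (assumed)
ALPHA, GAMMA = 0.1, 0.9   # Q-learning step size and discount factor (assumed)

# Doubly stochastic combination matrix over a ring of agents (assumed topology).
A = np.zeros((N_AGENTS, N_AGENTS))
for k in range(N_AGENTS):
    A[k, k] = 0.5
    A[k, (k + 1) % N_AGENTS] = 0.25
    A[k, (k - 1) % N_AGENTS] = 0.25

# Each agent observes the hidden global state through its own Gaussian likelihood.
obs_means = rng.normal(size=(N_AGENTS, N_STATES))

def log_likelihood(agent, obs):
    """Log-likelihood of one agent's scalar observation under each candidate state."""
    return -0.5 * (obs - obs_means[agent]) ** 2

def social_learning_step(beliefs, observations):
    """One adaptive social-learning step: discounted local Bayesian update,
    then geometric (log-linear) averaging of intermediate beliefs over the network."""
    log_psi = np.empty_like(beliefs)
    for k in range(N_AGENTS):
        log_psi[k] = ((1 - DELTA) * np.log(np.maximum(beliefs[k], 1e-300))
                      + DELTA * log_likelihood(k, observations[k]))
    log_mu = A.T @ log_psi                      # neighbor combination (A symmetric here)
    new = np.exp(log_mu - log_mu.max(axis=1, keepdims=True))
    return new / new.sum(axis=1, keepdims=True)

beliefs = np.full((N_AGENTS, N_STATES), 1.0 / N_STATES)   # each agent's belief over states
Q = np.zeros((N_AGENTS, N_STATES, N_ACTIONS))             # per-agent tabular Q-functions
true_state = rng.integers(N_STATES)                       # hidden global state (toy setting)
prev = None

for t in range(500):
    # (1) One social-learning step on this iteration's private observations.
    obs = rng.normal(loc=obs_means[np.arange(N_AGENTS), true_state])
    beliefs = social_learning_step(beliefs, obs)
    s = beliefs.argmax(axis=1)                             # MAP state estimate per agent

    # Epsilon-greedy actions based on the estimated state.
    greedy = Q[np.arange(N_AGENTS), s].argmax(axis=1)
    explore = rng.integers(N_ACTIONS, size=N_AGENTS)
    actions = np.where(rng.random(N_AGENTS) < 0.1, explore, greedy)
    rewards = (actions == true_state % N_ACTIONS).astype(float)   # toy reward signal

    # (2) One Q-learning step on the previous transition (no inner loop to convergence).
    if prev is not None:
        ps, pa, pr = prev
        for k in range(N_AGENTS):
            td = pr[k] + GAMMA * Q[k, s[k]].max() - Q[k, ps[k], pa[k]]
            Q[k, ps[k], pa[k]] += ALPHA * td
    prev = (s, actions, rewards)
```

The key point the sketch tries to capture is the interleaving: the belief update and the value update each advance by exactly one step per iteration, instead of running the slow learner to convergence inside a faster loop as two-timescale schemes do.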
Similar Papers
Remembering the Markov Property in Cooperative MARL
Machine Learning (CS)
Teaches robots to work together by learning rules.
Networked Agents in the Dark: Team Value Learning under Partial Observability
Machine Learning (CS)
Agents learn to cooperate when each one sees only part of the picture.
Evolution of Societies via Reinforcement Learning
Machine Learning (CS)
Lets many computer players learn together faster.