MIR: Efficient Exploration in Episodic Multi-Agent Reinforcement Learning via Mutual Intrinsic Reward
By: Kesheng Chen, Wenjian Luo, Bang Zhang, and more
Potential Business Impact:
Helps robot teams learn to work together better.
Episodic rewards present a significant challenge in reinforcement learning. While intrinsic reward methods have demonstrated effectiveness in single-agent reinforcement learning scenarios, their application to multi-agent reinforcement learning (MARL) remains problematic. The primary difficulties stem from two factors: (1) the exponential sparsity of reward-yielding joint action trajectories as the exploration space expands, and (2) existing methods often fail to account for joint actions that can influence team states. To address these challenges, this paper introduces Mutual Intrinsic Reward (MIR), a simple yet effective enhancement strategy for MARL with extremely sparse rewards such as episodic rewards. MIR incentivizes individual agents to explore actions that affect their teammates and, when combined with the original strategies, effectively stimulates team exploration and improves algorithm performance. For comprehensive experimental validation, we extend the representative single-agent MiniGrid environment to create MiniGrid-MA, a series of MARL environments with sparse rewards. We compare the proposed method against state-of-the-art approaches in the MiniGrid-MA setting, and the experimental results demonstrate superior performance.
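The abstract describes the mechanism only at a high level, so the sketch below is one plausible instantiation rather than the paper's actual formulation: each agent receives an intrinsic bonus proportional to how much its action changed its teammates' states, here measured against a counterfactual next state predicted under a no-op action (the counterfactual predictor, the norm-based influence measure, and all function names are assumptions for illustration). The bonus is then added to the sparse episodic reward with a weighting coefficient.

```python
import numpy as np

def mutual_intrinsic_reward(next_states, counterfactual_next_states):
    """Hypothetical sketch of a mutual intrinsic reward for agent i.

    Rewards agent i by how much its action changed teammates' states,
    relative to a counterfactual in which agent i took a no-op action.

    next_states: dict mapping teammate id -> observed next state (np.ndarray)
    counterfactual_next_states: dict mapping teammate id -> next state
        predicted by an assumed learned dynamics model under agent i's no-op
    """
    reward = 0.0
    for j, s_j in next_states.items():
        # Influence on teammate j: deviation caused by agent i's action.
        reward += np.linalg.norm(s_j - counterfactual_next_states[j])
    return reward

def total_reward(r_extrinsic, r_mir, beta=0.1):
    """Assumed combination: sparse episodic reward plus a weighted
    mutual intrinsic bonus (beta is a tunable coefficient)."""
    return r_extrinsic + beta * r_mir
```

Under this reading, an agent that never affects its teammates earns no intrinsic bonus, so exploration is steered toward the joint interactions that sparse episodic rewards alone rarely reinforce.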
Similar Papers
Preference-Guided Learning for Sparse-Reward Multi-Agent Reinforcement Learning
Machine Learning (CS)
Teaches robots to learn from few rewards.
Symmetry-Guided Multi-Agent Inverse Reinforcement Learning
Robotics
Robots learn better with less practice.
LLM-Driven Intrinsic Motivation for Sparse Reward Reinforcement Learning
Machine Learning (CS)
Helps robots learn faster in tricky games.