Wasserstein-Barycenter Consensus for Cooperative Multi-Agent Reinforcement Learning
By: Ali Baheri
Potential Business Impact:
Teaches robots to work together better.
Cooperative multi-agent reinforcement learning (MARL) demands principled mechanisms to align heterogeneous policies while preserving the capacity for specialized behavior. We introduce a novel consensus framework that defines the team strategy as the entropic-regularized $p$-Wasserstein barycenter of agents' joint state--action visitation measures. By augmenting each agent's policy objective with a soft penalty proportional to its Sinkhorn divergence from this barycenter, the proposed approach encourages coherent group behavior without enforcing rigid parameter sharing. We derive an algorithm that alternates between Sinkhorn-barycenter computation and policy-gradient updates, and we prove that, under standard Lipschitz and compactness assumptions, the maximal pairwise policy discrepancy contracts at a geometric rate. Empirical evaluation on a cooperative navigation case study demonstrates that our optimal-transport (OT) barycenter consensus outperforms an independent-learners baseline in convergence speed and final coordination success.
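The abstract describes an alternation between Sinkhorn-barycenter computation and Sinkhorn-divergence-penalized policy updates. The sketch below is a minimal NumPy illustration of the consensus-penalty side only, not the authors' implementation: it assumes a small discrete state--action support with a precomputed cost matrix `C` and per-agent visitation histograms `mus`, and the function names, regularization strength, and penalty weight are illustrative assumptions. The policy-gradient step that would consume the resulting penalty is omitted.

```python
# Illustrative sketch (not the paper's code): entropic Wasserstein barycenter of
# per-agent visitation histograms plus a Sinkhorn-divergence consensus penalty.
import numpy as np

def sinkhorn_barycenter(mus, C, eps=0.05, n_iters=200):
    """Entropic barycenter of histograms `mus` (k x n) on a fixed support with
    cost matrix C (n x n), via iterative Bregman projections."""
    K = np.exp(-C / eps)                          # Gibbs kernel
    k, n = mus.shape
    v = np.ones((k, n))
    bary = np.full(n, 1.0 / n)
    for _ in range(n_iters):
        u = mus / ((K @ v.T).T + 1e-300)          # match each agent's marginal
        proj = (K.T @ u.T).T + 1e-300             # k projected marginals
        bary = np.exp(np.log(proj).mean(axis=0))  # geometric mean = barycenter
        v = bary[None, :] / proj
    return bary / bary.sum()

def sinkhorn_cost(a, b, C, eps=0.05, n_iters=200):
    """Entropy-regularized OT cost <P, C> between histograms a and b."""
    K = np.exp(-C / eps)
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v + 1e-300)
        v = b / (K.T @ u + 1e-300)
    P = u[:, None] * K * v[None, :]               # optimal transport plan
    return float((P * C).sum())

def sinkhorn_divergence(a, b, C, eps=0.05):
    """Debiased Sinkhorn divergence S_eps(a, b)."""
    return (sinkhorn_cost(a, b, C, eps)
            - 0.5 * sinkhorn_cost(a, a, C, eps)
            - 0.5 * sinkhorn_cost(b, b, C, eps))

# Toy usage: penalize each agent by its divergence from the team barycenter.
rng = np.random.default_rng(0)
n, k, lam = 16, 3, 0.5                            # support size, agents, penalty weight (assumed)
support = rng.normal(size=(n, 2))                 # embedded state-action "atoms"
C = np.linalg.norm(support[:, None] - support[None, :], axis=-1) ** 2
C /= C.max()                                      # normalize costs for numerical stability
mus = rng.dirichlet(np.ones(n), size=k)           # per-agent visitation histograms
bary = sinkhorn_barycenter(mus, C)
for i, mu in enumerate(mus):
    penalty = lam * sinkhorn_divergence(mu, bary, C)
    print(f"agent {i}: consensus penalty = {penalty:.4f}")
```

In a full MARL loop, the histograms would be estimated from each agent's rollouts and the penalty would be added to that agent's policy-gradient objective, with the barycenter recomputed between updates as in the alternation the abstract describes.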
Similar Papers
Wasserstein Barycenter Soft Actor-Critic
Machine Learning (CS)
Teaches robots to learn faster with less practice.
Collaborative Bayesian Optimization via Wasserstein Barycenters
Machine Learning (CS)
Helps computers learn secrets without sharing data.
Heterogeneous Federated Reinforcement Learning Using Wasserstein Barycenters
Machine Learning (CS)
Teaches AI to learn from many separate computers.