Independent Learning in Performative Markov Potential Games
By: Rilind Sahitaj, Paulius Sasnauskas, Yiğit Yalın, and more
Potential Business Impact:
Helps AI agents keep learning reliably even when their own behavior changes the game they are playing.
Performative Reinforcement Learning (PRL) refers to a scenario in which the deployed policy changes the reward and transition dynamics of the underlying environment. In this work, we study multi-agent PRL by incorporating performative effects into Markov Potential Games (MPGs). We introduce the notion of a performatively stable equilibrium (PSE) and show that it always exists under a reasonable sensitivity assumption. We then provide convergence results for state-of-the-art algorithms used to solve MPGs. Specifically, we show that independent policy gradient ascent (IPGA) and independent natural policy gradient (INPG) converge to an approximate PSE in the best-iterate sense, with an additional term that accounts for the performative effects. Furthermore, we show that INPG asymptotically converges to a PSE in the last-iterate sense. As the performative effects vanish, we recover the convergence rates from prior work. For a special case of our game, we provide finite-time last-iterate convergence results for a repeated retraining approach, in which agents independently optimize a surrogate objective. We conduct extensive experiments to validate our theoretical findings.
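To make the repeated-retraining idea concrete, here is a minimal Python sketch, not the authors' code, of independent projected policy gradient ascent in a tiny single-state potential game whose payoffs shift with the deployed joint policy. The payoff matrix PHI_BASE, the sensitivity parameter EPS, the step size ETA, and the linear form of the performative shift are all illustrative assumptions; the paper only assumes bounded sensitivity of rewards and transitions to the deployed policy. A deployed profile that retraining no longer moves is an approximate performatively stable equilibrium (PSE).

# Minimal sketch (illustrative assumptions, not the paper's implementation):
# repeated retraining with independent projected policy gradient ascent in a
# 2-player, 2-action performative potential game.
import numpy as np

# Base potential of a single-state coordination game; phi[a1, a2] is the
# potential of the joint action (a1, a2).
PHI_BASE = np.array([[1.0, 0.0],
                     [0.0, 0.7]])

EPS = 0.2   # strength of the performative effect (EPS = 0 gives a static potential game)
ETA = 0.5   # policy gradient step size

def induced_potential(deployed):
    # Environment induced by the deployed joint policy: here the potential is
    # shifted linearly in the deployed policies, standing in for the paper's
    # bounded-sensitivity assumption on rewards and transitions.
    shift = EPS * np.outer(deployed[0], deployed[1])
    return PHI_BASE + shift

def agent_gradients(policies, phi):
    # Exact gradients of the common potential p1^T phi p2 for each agent,
    # holding the induced environment phi fixed.
    p1, p2 = policies
    return [phi @ p2, phi.T @ p1]

def project_simplex(x):
    # Euclidean projection onto the probability simplex (sort-based method).
    u = np.sort(x)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1 - css) / (np.arange(len(x)) + 1) > 0)[0][-1]
    theta = (1 - css[rho]) / (rho + 1)
    return np.maximum(x + theta, 0)

# Repeated retraining: deploy the current profile, let each agent independently
# run projected policy gradient ascent in the induced (fixed) environment,
# redeploy, and stop once the deployed profile no longer moves.
policies = [np.array([0.5, 0.5]) for _ in range(2)]
for retraining_round in range(50):
    deployed = [p.copy() for p in policies]      # the environment reacts to these
    phi = induced_potential(deployed)
    for _ in range(200):                         # independent projected PGA
        grads = agent_gradients(policies, phi)
        policies = [project_simplex(p + ETA * g) for p, g in zip(policies, grads)]
    drift = max(np.abs(p - d).max() for p, d in zip(policies, deployed))
    if drift < 1e-6:                             # deployed profile is approximately stable
        break

print("approximate performatively stable profile:", [p.round(3) for p in policies])

With EPS set to 0 the loop reduces to ordinary independent policy gradient ascent in a static potential game, mirroring the abstract's observation that the convergence rates of prior work are recovered as the performative effects vanish.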
Similar Papers
On Corruption-Robustness in Performative Reinforcement Learning
Machine Learning (CS)
Makes AI learn safely even with bad information.
Aspiration-based Perturbed Learning Automata in Games with Noisy Utility Measurements. Part B: Stochastic Stability in Weakly Acyclic Games
CS and Game Theory
Helps game players learn to win together.
Enhancing Agentic RL with Progressive Reward Shaping and Value-based Sampling Policy Optimization
Computation and Language
Helps AI learn to solve harder problems faster.