PADiff: Predictive and Adaptive Diffusion Policies for Ad Hoc Teamwork
By: Hohei Chan, Xinzhi Zhang, Antao Xiang and more
Potential Business Impact:
Helps robots learn to work together instantly.
Ad hoc teamwork (AHT) requires agents to collaborate with previously unseen teammates, which is crucial for many real-world applications. The core challenge of AHT is to develop an ego agent that can predict and adapt to unknown teammates on the fly. Conventional RL-based approaches optimize a single expected return, which often causes policies to collapse into a single dominant behavior and thus fail to capture the multimodal cooperation patterns inherent in AHT. In this work, we introduce PADiff, a diffusion-based approach that captures an agent's multimodal behaviors, unlocking its diverse cooperation modes with teammates. However, standard diffusion models lack the ability to predict and adapt in highly non-stationary AHT scenarios. To address this limitation, we propose a novel diffusion-based policy that integrates critical predictive information about teammates into the denoising process. Extensive experiments across three cooperation environments demonstrate that PADiff significantly outperforms existing AHT methods.
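To make the idea of a diffusion policy conditioned on teammate predictions concrete, here is a minimal toy sketch of standard DDPM reverse sampling over a one-dimensional action. It is not PADiff's architecture. The names `action_modes`, `eps_model`, and `sample_action`, the scalar `teammate_pred`, and the noise schedule are all assumptions made for illustration. In place of a learned network, the noise predictor is the exact posterior mean for a toy action distribution with two equally good cooperative responses, so the sampler can land on either mode instead of collapsing to one.

```python
import math
import random

# Hypothetical linear noise schedule (not taken from the paper).
T = 50
BETAS = [1e-4 + (0.2 - 1e-4) * t / (T - 1) for t in range(T)]
ALPHAS = [1.0 - b for b in BETAS]
ALPHA_BARS = []
_prod = 1.0
for _a in ALPHAS:
    _prod *= _a
    ALPHA_BARS.append(_prod)


def action_modes(teammate_pred):
    """Toy assumption: two equally good ego actions, shifted by the
    predicted teammate behavior (a scalar here for simplicity)."""
    return [teammate_pred - 1.0, teammate_pred + 1.0]


def eps_model(x_t, t, teammate_pred):
    """Stand-in for a learned denoiser: the exact E[eps | x_t] when clean
    actions are an equal mixture of point masses at action_modes(...)."""
    ab = ALPHA_BARS[t]
    modes = action_modes(teammate_pred)
    # Log-responsibility of each mode for the noisy action x_t.
    logits = [-(x_t - math.sqrt(ab) * m) ** 2 / (2.0 * (1.0 - ab)) for m in modes]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return sum(
        (w / total) * (x_t - math.sqrt(ab) * m) / math.sqrt(1.0 - ab)
        for w, m in zip(weights, modes)
    )


def sample_action(teammate_pred, rng):
    """DDPM ancestral sampling; the teammate prediction enters every
    denoising step through the noise predictor."""
    x = rng.gauss(0.0, 1.0)  # start from pure noise
    for t in reversed(range(T)):
        eps = eps_model(x, t, teammate_pred)
        mean = (x - BETAS[t] / math.sqrt(1.0 - ALPHA_BARS[t]) * eps) / math.sqrt(ALPHAS[t])
        noise = rng.gauss(0.0, 1.0) if t > 0 else 0.0  # no noise on the last step
        x = mean + math.sqrt(BETAS[t]) * noise
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    print([round(sample_action(0.5, rng), 3) for _ in range(8)])
```

Repeated calls with the same `teammate_pred` return actions near either mode, and changing `teammate_pred` moves both modes. A policy trained to maximize a single expected return would typically commit to just one of them.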
Similar Papers
ADPro: a Test-time Adaptive Diffusion Policy for Robot Manipulation via Manifold and Initial Noise Constraints
Robotics
Robots learn to do tasks faster and better.
Zero-Shot Coordination in Ad Hoc Teams with Generalized Policy Improvement and Difference Rewards
Multiagent Systems
Robots learn to work together instantly with new friends.
AID: Agent Intent from Diffusion for Multi-Agent Informative Path Planning
Robotics
Helps robots find more information faster together.