MARL Warehouse Robots

Published: December 4, 2025 | arXiv ID: 2512.04463v1

By: Price Allman, Lian Thang, Dre Simmons and more

Potential Business Impact:

Robots learn to work together to move packages.

Business Areas:
Machine Learning, Artificial Intelligence, Data and Analytics, Software

We present a comparative study of multi-agent reinforcement learning (MARL) algorithms for cooperative warehouse robotics. We evaluate QMIX and IPPO on the Robotic Warehouse (RWARE) environment and a custom Unity 3D simulation. Our experiments reveal that QMIX's value decomposition significantly outperforms independent learning approaches (achieving 3.25 mean return vs. 0.38 for advanced IPPO), but requires extensive hyperparameter tuning -- particularly extended epsilon annealing (5M+ steps) for sparse reward discovery. We demonstrate successful deployment in Unity ML-Agents, achieving consistent package delivery after 1M training steps. While MARL shows promise for small-scale deployments (2-4 robots), significant scaling challenges remain. Code and analyses: https://pallman14.github.io/MARL-QMIX-Warehouse-Robots/
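The extended epsilon annealing mentioned in the abstract can be illustrated with a linear exploration schedule. This is a minimal sketch, not the authors' implementation: only the 5M-step annealing horizon comes from the abstract, while the function name `linear_epsilon`, the linear shape, and the start and end values (1.0 and 0.05) are assumptions chosen for illustration.

```python
def linear_epsilon(
    step: int,
    anneal_steps: int = 5_000_000,  # horizon taken from the abstract (5M+ steps)
    eps_start: float = 1.0,         # assumed starting exploration rate
    eps_end: float = 0.05,          # assumed final exploration rate
) -> float:
    """Linearly anneal epsilon from eps_start to eps_end over anneal_steps."""
    if anneal_steps <= 0:
        return eps_end
    # Clamp progress to [0, 1] so epsilon stays at eps_end after annealing ends.
    frac = min(max(step, 0) / anneal_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)


if __name__ == "__main__":
    for s in (0, 1_000_000, 2_500_000, 5_000_000, 10_000_000):
        print(s, round(linear_epsilon(s), 4))
```

With a long horizon like this, agents keep taking a substantial share of random actions for millions of steps, which gives them more chances to stumble on a rare delivery reward before the policy turns mostly greedy.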

Page Count
6 pages

Category
Computer Science:
Artificial Intelligence