Dynamic one-time delivery of critical data by small and sparse UAV swarms: a model problem for MARL scaling studies

Published: December 10, 2025 | arXiv ID: 2512.09682v1

By: Mika Persson, Jonas Lidman, Jacob Ljungberg and more

Potential Business Impact:

Small drone swarms learn to relay a critical data package to a known location.

Business Areas:
Drone Management Hardware, Software

This work presents a conceptual study on the application of Multi-Agent Reinforcement Learning (MARL) to the decentralized control of unmanned aerial vehicles that relay a critical data package to a known position. For this purpose, a family of deterministic games is introduced, designed for MARL scaling studies. A robust baseline policy is proposed, based on restricting agent motion envelopes and applying Dijkstra's algorithm. Experimental results show that two off-the-shelf MARL algorithms perform competitively with the baseline for a small number of agents, but scalability issues arise as the number of agents increases.
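
The abstract names Dijkstra's algorithm as one ingredient of the baseline policy but does not describe the graph it runs on. The following is a minimal sketch only, not the authors' implementation. It assumes a hypothetical relay graph whose nodes are hand-over points and whose edge weights are illustrative transfer costs; the names `dijkstra` and `relay_graph` and all numbers are invented for illustration.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (cost, path) for a cheapest source-to-target path.

    graph maps each node to a dict {neighbor: non-negative edge weight}.
    Returns (inf, []) when the target cannot be reached.
    """
    dist = {source: 0.0}      # best known cost to each discovered node
    prev = {}                 # predecessor on the best known path
    heap = [(0.0, source)]    # priority queue of (cost, node)
    visited = set()

    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue          # stale queue entry, already settled
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))

    if target not in dist:
        return float("inf"), []

    # Walk predecessors back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical relay graph (assumption): weights are illustrative costs.
relay_graph = {
    "source": {"uav_a": 2.0, "uav_b": 5.0},
    "uav_a": {"uav_b": 1.0, "target": 6.0},
    "uav_b": {"target": 2.0},
    "target": {},
}

cost, path = dijkstra(relay_graph, "source", "target")
```

In this toy graph the cheapest route hands the package from `uav_a` to `uav_b` before reaching the target. How the paper's restricted motion envelopes determine the actual nodes and weights is not stated in the abstract, so that mapping is left open here.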

Country of Origin
🇸🇪 Sweden

Page Count
8 pages

Category
Electrical Engineering and Systems Science:
Systems and Control