
Multi-agent Robust and Optimal Policy Learning for Data Harvesting

Published: August 22, 2025 | arXiv ID: 2508.16490v1

By: Shili Wu, Yancheng Zhu, Aniruddha Datta and more

Potential Business Impact:

Drones collect sensor data faster and smarter.

Business Areas:

Smart Cities, Real Estate

We consider the problem of using multiple agents to harvest data from a collection of sensor nodes (targets) scattered across a two-dimensional environment. These targets transmit their data to the agents that move in the space above them, and our goal is for the agents to collect data from the targets as efficiently as possible while moving to their final destinations. The agents are assumed to have a continuous control action, and we leverage reinforcement learning, specifically Proximal Policy Optimization (PPO) with Lagrangian Penalty (LP), to identify highly effective solutions. Additionally, we enhance the controller's robustness by incorporating regularization at each state to smooth the learned policy. We conduct a series of simulations to demonstrate our approach and validate its performance and robustness.
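The abstract names three ingredients (a PPO objective, a Lagrangian penalty, and a smoothness regularizer) without giving their exact form. The following is a minimal, generic sketch of how such terms are commonly written, not the authors' formulation: the constraint being penalized, the perturbation-based regularizer, and every function name and hyperparameter here are assumptions made for illustration.

```python
import random


def ppo_clip_surrogate(ratios, advantages, eps=0.2):
    """Mean clipped PPO surrogate over a batch (a quantity to maximize).

    ratios: new-policy / old-policy action probabilities, one per sample.
    advantages: advantage estimates for the same samples.
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped_r = min(max(r, 1.0 - eps), 1.0 + eps)
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


def lagrangian_objective(surrogate, cost, cost_limit, lam):
    """Surrogate minus the multiplier-weighted constraint violation.

    `cost` and `cost_limit` stand in for a hypothetical constraint,
    e.g. an expected penalty that must stay below a budget.
    """
    return surrogate - lam * (cost - cost_limit)


def update_multiplier(lam, cost, cost_limit, lr=0.1):
    """Dual ascent step, projected so the multiplier stays non-negative."""
    return max(0.0, lam + lr * (cost - cost_limit))


def smoothness_penalty(policy, states, noise=0.05, rng=None):
    """Mean squared change in the action when each state is perturbed.

    `policy` maps a state (list of floats) to an action (list of floats).
    Perturbations are uniform in [-noise, noise] per coordinate, which is
    one assumed choice among many.
    """
    rng = rng or random.Random(0)
    total = 0.0
    for s in states:
        s_near = [x + rng.uniform(-noise, noise) for x in s]
        a, a_near = policy(s), policy(s_near)
        total += sum((u - v) ** 2 for u, v in zip(a, a_near))
    return total / len(states)
```

In a training loop of this kind, the policy would typically be updated to increase `lagrangian_objective(...)` minus a weighted `smoothness_penalty(...)`, while `update_multiplier` raises the multiplier whenever the measured cost exceeds its limit and lets it decay toward zero otherwise.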

Learning to Lead Themselves: Agentic AI in MAS using MARL

Artificial Intelligence

Drones work together to deliver packages faster.

24 Sep 2025

89%

Heterogeneous Multi-Agent Proximal Policy Optimization for Power Distribution System Restoration

Artificial Intelligence

Fixes power grids faster after blackouts.

18 Nov 2025

88%

Hierarchical Policy-Gradient Reinforcement Learning for Multi-Agent Shepherding Control of Non-Cohesive Targets

Machine Learning (CS)

Guides many robots to herd moving things.

3 Apr 2025


Page Count

6 pages
