
Evader-Agnostic Team-Based Pursuit Strategies in Partially-Observable Environments

Published: November 8, 2025 | arXiv ID: 2511.05812v1

By: Addison Kalanther, Daniel Bostwick, Chinmay Maheshwari and more

Potential Business Impact:

Drones learn to find and catch hidden flying targets.

Business Areas:

Autonomous Vehicles, Transportation

We consider a scenario where a team of two unmanned aerial vehicles (UAVs) pursues an evader UAV within an urban environment. Each agent has a limited view of its environment, in which buildings can occlude its field of view. Additionally, the pursuer team is agnostic about the evader's initial and final locations and about its behavior. Consequently, the team needs to gather information by searching the environment, then track the evader and eventually intercept it. To solve this multi-player, partially-observable pursuit-evasion game, we develop a two-phase neuro-symbolic algorithm centered around the principle of bounded rationality. First, we devise an offline approach using deep reinforcement learning to progressively train adversarial policies for the pursuer team against fictitious evaders. This creates $k$ levels of rationality for each agent in preparation for the online phase. Then, we employ an online classification algorithm to determine a "best guess" of our current opponent from the set of iteratively-trained strategic agents and apply the best player response. Using this schema, we improved average performance when facing a random evader in our environment.
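
The online phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each candidate evader level can be summarized as a fixed probability distribution over discrete actions (the trained policies would depend on state), it uses a simple maximum-log-likelihood classifier as a stand-in for the paper's classification algorithm, and it assumes the usual level-k convention that the best response to a level-j evader is the level-(j+1) pursuer policy. The names `classify_evader`, `best_response_level`, and the action labels are hypothetical.

```python
import math

def classify_evader(observed_actions, evader_models):
    """Return the index of the evader model with the highest log-likelihood.

    Assumption: each model is a dict mapping action -> probability and
    ignores state, which is a simplification for illustration only.
    """
    scores = []
    for model in evader_models:
        # Floor probabilities so an unseen action does not produce log(0).
        log_lik = sum(
            math.log(max(model.get(action, 0.0), 1e-9))
            for action in observed_actions
        )
        scores.append(log_lik)
    # Ties resolve to the lowest level, since max() keeps the first maximum.
    return max(range(len(scores)), key=scores.__getitem__)

def best_response_level(evader_level):
    """Assumed level-k convention: respond one level above the opponent."""
    return evader_level + 1

# Hypothetical action distributions for three fictitious evader levels.
evader_models = [
    {"north": 0.25, "south": 0.25, "east": 0.25, "west": 0.25},  # level 0: random
    {"north": 0.70, "south": 0.10, "east": 0.10, "west": 0.10},  # level 1
    {"north": 0.10, "south": 0.10, "east": 0.70, "west": 0.10},  # level 2
]

observed = ["east", "east", "north", "east"]
guess = classify_evader(observed, evader_models)
pursuer_level = best_response_level(guess)
print(f"best-guess evader level: {guess}, pursuer policy level: {pursuer_level}")
```

In this sketch the guess is recomputed from the full action history; an incremental variant would keep a running log-likelihood per level and update it after each observation of the evader.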