Integrated Control and Active Perception in POMDPs for Temporal Logic Tasks and Information Acquisition
By: Chongyang Shi, Michael R. Dorothy, Jie Fu
Potential Business Impact:
Helps drones find secrets by watching and moving.
This paper studies the synthesis of a joint control and active perception policy for a stochastic system modeled as a partially observable Markov decision process (POMDP), subject to temporal logic specifications. The POMDP actions influence both system dynamics (control) and the emission function (perception). Beyond task completion, the planner seeks to maximize information gain about certain temporal events (the secret) through coordinated perception and control. To enable active information acquisition, we introduce minimizing the Shannon conditional entropy of the secret as a planning objective, alongside maximizing the probability of satisfying the temporal logic formula within a finite horizon. Using a variant of observable operators in hidden Markov models (HMMs) and POMDPs, we establish key properties of the conditional entropy gradient with respect to policy parameters. These properties facilitate efficient policy gradient computation. We validate our approach through graph-based examples, inspired by common security applications with UAV surveillance.
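To make the information-acquisition objective concrete, here is a minimal, self-contained sketch (not code from the paper) of the Shannon conditional entropy H(Z | Y) of a secret Z given observations Y, computed from a joint probability table. The paper's planner minimizes this quantity over policy parameters via policy gradients; the sketch only illustrates the entropy term itself, and the example joint distributions are hypothetical.

```python
from math import log2

def conditional_entropy(joint):
    """H(Z | Y) = H(Z, Y) - H(Y) in bits, for a joint table joint[z][y].

    Lower values mean the observations Y are more informative about the
    secret Z, which is what an active-perception policy tries to achieve.
    """
    total = sum(p for row in joint for p in row)
    # Normalized joint distribution p(z, y).
    p_zy = [p / total for row in joint for p in row]
    # Marginal p(y), summing the secret variable out of each column.
    ncols = len(joint[0])
    p_y = [sum(row[y] for row in joint) / total for y in range(ncols)]

    def entropy(ps):
        return -sum(p * log2(p) for p in ps if p > 0)

    return entropy(p_zy) - entropy(p_y)

# Uninformative sensing: Y independent of Z, so H(Z | Y) = H(Z) = 1 bit.
uninformative = [[0.25, 0.25],
                 [0.25, 0.25]]
# Perfectly revealing sensing: Y determines Z, so H(Z | Y) = 0 bits.
revealing = [[0.5, 0.0],
             [0.0, 0.5]]

print(conditional_entropy(uninformative))  # 1.0
print(conditional_entropy(revealing))      # 0.0
```

In the paper's setting, the joint distribution over the secret and the observation sequence is induced by the POMDP dynamics, the emission function, and the policy, so this entropy becomes a differentiable function of the policy parameters.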
Similar Papers
Learning Symbolic Persistent Macro-Actions for POMDP Solving Over Time
Artificial Intelligence
Teaches robots to make smart choices faster.
Control Synthesis in Partially Observable Environments for Complex Perception-Related Objectives
Systems and Control
Helps robots learn to do tasks with incomplete information.
IMAS$^2$: Joint Agent Selection and Information-Theoretic Coordinated Perception In Dec-POMDPs
Systems and Control
Helps robots work together to see better.