Informed Asymmetric Actor-Critic: Leveraging Privileged Signals Beyond Full-State Access

Published: September 30, 2025 | arXiv ID: 2509.26000v1

By: Daniel Ebi, Gaspard Lambrechts, Damien Ernst and more

Potential Business Impact:

Helps robots learn better with hidden clues.

Business Areas:

Artificial Intelligence, Data and Analytics, Science and Engineering, Software

Reinforcement learning in partially observable environments requires agents to act under uncertainty from noisy, incomplete observations. Asymmetric actor-critic methods leverage privileged information during training to improve learning under these conditions. However, existing approaches typically assume full-state access during training. In this work, we challenge this assumption by proposing a novel actor-critic framework, called informed asymmetric actor-critic, that enables conditioning the critic on arbitrary privileged signals without requiring access to the full state. We show that policy gradients remain unbiased under this formulation, extending the theoretical foundation of asymmetric methods to the more general case of privileged partial information. To quantify the impact of such signals, we propose informativeness measures based on kernel methods and return prediction error, providing practical tools for evaluating training-time signals. We validate our approach empirically on benchmark navigation tasks and synthetic partially observable environments, showing that our informed asymmetric method improves learning efficiency and value estimation when informative privileged inputs are available. Our findings challenge the necessity of full-state access and open new directions for designing asymmetric reinforcement learning methods that are both practical and theoretically sound.
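
To make the asymmetry described in the abstract concrete, here is a minimal sketch of one actor-critic update in which the critic is conditioned on a privileged signal while the actor sees only the observation. It is an illustration under stated assumptions, not the paper's algorithm: the linear critic, the softmax actor, the one-step TD error, and every name (`policy_probs`, `critic_value`, `update`, the learning rates) are hypothetical choices made for this example, and a single observation vector stands in for whatever the agent conditions on in the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(theta, obs):
    """Actor: linear-softmax policy that sees ONLY the observation.

    theta[a] is the (hypothetical) weight vector for action a.
    """
    logits = [sum(t * o for t, o in zip(theta_a, obs)) for theta_a in theta]
    return softmax(logits)


def critic_value(w, obs, signal):
    """Critic: linear value estimate over observation plus privileged signal.

    The signal is any training-time input; it need not be the full state.
    """
    feats = list(obs) + list(signal)
    return sum(wi * fi for wi, fi in zip(w, feats))


def update(theta, w, obs, signal, action, reward, next_obs, next_signal,
           done, gamma=0.99, lr_actor=0.1, lr_critic=0.1):
    """One-step TD actor-critic update (assumed form, for illustration)."""
    v = critic_value(w, obs, signal)
    v_next = 0.0 if done else critic_value(w, next_obs, next_signal)
    td_error = reward + gamma * v_next - v

    # Critic step: semi-gradient TD(0) on the privileged features.
    feats = list(obs) + list(signal)
    new_w = [wi + lr_critic * td_error * fi for wi, fi in zip(w, feats)]

    # Actor step: grad of log softmax is (1[a == action] - pi(a|obs)) * obs.
    probs = policy_probs(theta, obs)
    new_theta = []
    for a, theta_a in enumerate(theta):
        coeff = (1.0 if a == action else 0.0) - probs[a]
        new_theta.append(
            [t + lr_actor * td_error * coeff * o for t, o in zip(theta_a, obs)]
        )
    return new_theta, new_w, td_error


# Usage: two actions, a 2-d observation, a 1-d privileged signal.
theta0 = [[0.0, 0.0], [0.0, 0.0]]
w0 = [0.0, 0.0, 0.0]
theta1, w1, delta = update(theta0, w0, obs=[1.0, 0.0], signal=[1.0],
                           action=0, reward=1.0,
                           next_obs=[0.0, 1.0], next_signal=[0.0], done=True)
```

In this sketch the privileged signal enters only through `critic_value`, so it is needed at training time but not when the policy is deployed.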

PIGDreamer: Privileged Information Guided World Models for Safe Partially Observable Reinforcement Learning

Machine Learning (CS)

Teaches robots to be safe and smart.

4 Aug 2025 0

87%

To Distill or Decide? Understanding the Algorithmic Trade-off in Partially Observable Reinforcement Learning

Machine Learning (CS)

Teaches robots to learn better from hidden information.

3 Oct 2025 1

87%

Optimistic critics can empower small actors

Machine Learning (CS)

Makes AI learn better by changing how it learns.

1 Jun 2025 1

Country of Origin

🇩🇪 🇧🇪 Germany, Belgium

Page Count

21 pages
