Value Under Ignorance in Universal Artificial Intelligence
By: Cole Wyeth, Marcus Hutter
We generalize the AIXI reinforcement learning agent to admit a wider class of utility functions. Assigning a utility to each possible interaction history forces us to confront the ambiguity that some hypotheses in the agent's belief distribution only predict a finite prefix of the history, which is sometimes interpreted as implying a chance of death equal to a quantity called the semimeasure loss. This death interpretation suggests one way to assign utilities to such history prefixes. We argue that it is equally natural to view these belief distributions as imprecise probability distributions, with the semimeasure loss as total ignorance. This motivates us to consider the consequences of computing expected utilities with Choquet integrals from imprecise probability theory, including an investigation of their level of computability. We recover the standard recursive value function as a special case. However, our most general expected utilities under the death interpretation cannot be characterized as such Choquet integrals.
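The Choquet integral mentioned in the abstract can be illustrated on a finite outcome set. Below is a minimal sketch, not the paper's construction: the function names, the toy semimeasure `m`, and the utilities `u` are all hypothetical. The semimeasure loss (here 0.1) is treated as total ignorance, i.e. as belief-function mass on the whole outcome set, so the lower probability of any proper subset is just its semimeasure mass and the full set gets capacity 1.

```python
def choquet(utilities, capacity, outcomes):
    """Discrete Choquet integral of `utilities` w.r.t. a monotone capacity."""
    xs = sorted(outcomes, key=lambda x: -utilities[x])  # utility-descending
    total, prev, upper = 0.0, 0.0, set()
    for x in xs:
        upper.add(x)
        cur = capacity(frozenset(upper))
        total += utilities[x] * (cur - prev)  # weight = capacity increment
        prev = cur
    return total

# Hypothetical toy semimeasure over three outcomes, total mass 0.9;
# the loss 0.1 represents total ignorance about where the missing mass goes.
outcomes = {"a", "b", "c"}
m = {"a": 0.5, "b": 0.3, "c": 0.1}

def lower_prob(A):
    # Lower envelope: any proper subset gets only its semimeasure mass;
    # the full outcome set is certain, so its capacity is 1.
    return 1.0 if A == frozenset(outcomes) else sum(m[x] for x in A)

u = {"a": 1.0, "b": 0.5, "c": 0.0}
print(choquet(u, lower_prob, outcomes))  # ≈ 0.65
```

The result equals the semimeasure expectation plus the loss times the worst-case utility (0.5·1.0 + 0.3·0.5 + 0.1·0.0 + 0.1·0.0 = 0.65), which is the pessimistic (lower) expected utility under the ignorance interpretation.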