Using causal abstractions to accelerate decision-making in complex bandit problems
By: Joel Dyer, Nicholas Bishop, Anisoara Calinescu, and more
Potential Business Impact:
Helps computers make good decisions faster by practicing on simpler, cheaper versions of the problem first.
Although real-world decision-making problems can often be encoded as causal multi-armed bandits (CMABs) at different levels of abstraction, a general methodology exploiting the information and computational advantages of each abstraction level is missing. In this paper, we propose AT-UCB, an algorithm which efficiently exploits shared information between CMAB problem instances defined at different levels of abstraction. More specifically, AT-UCB leverages causal abstraction (CA) theory to explore within a cheap-to-simulate and coarse-grained CMAB instance, before employing the traditional upper confidence bound (UCB) algorithm on a restricted set of potentially optimal actions in the CMAB of interest, leading to significant reductions in cumulative regret when compared to the classical UCB algorithm. We illustrate the advantages of AT-UCB theoretically, through a novel upper bound on the cumulative regret, and empirically, by applying AT-UCB to epidemiological simulators with varying resolution and computational cost.
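The abstract describes a two-phase structure: explore a cheap, coarse-grained CMAB to narrow down the set of potentially optimal actions, then run classical UCB on that restricted set in the expensive target CMAB. The sketch below illustrates only this structure under assumptions of my own; the function names (`pull_abstract`, `pull_ground`, `abstraction_map`), the round-robin abstract exploration, and the simple top-k pruning rule are placeholders, not the paper's causal-abstraction-based restriction procedure.

```python
import numpy as np

def ucb_restricted(pull_ground, candidate_actions, horizon, c=2.0):
    """Standard UCB1, run only over a restricted candidate set of actions."""
    counts = {a: 0 for a in candidate_actions}
    means = {a: 0.0 for a in candidate_actions}
    rewards = []
    for t in range(1, horizon + 1):
        untried = [a for a in candidate_actions if counts[a] == 0]
        if untried:
            a = untried[0]  # play each candidate once before using UCB indices
        else:
            a = max(candidate_actions,
                    key=lambda a: means[a] + np.sqrt(c * np.log(t) / counts[a]))
        r = pull_ground(a)
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]  # incremental mean update
        rewards.append(r)
    return rewards

def at_ucb_sketch(pull_abstract, pull_ground, actions, abstraction_map,
                  n_abstract_pulls=500, keep_top_k=3, horizon=1000):
    """Hypothetical two-phase sketch: explore the cheap abstract CMAB,
    then run UCB on the surviving ground-level actions."""
    # Phase 1: estimate abstract-level action values from cheap simulations.
    abstract_actions = sorted({abstraction_map(a) for a in actions})
    samples = {z: [] for z in abstract_actions}
    for i in range(n_abstract_pulls):
        z = abstract_actions[i % len(abstract_actions)]  # round-robin exploration
        samples[z].append(pull_abstract(z))
    abstract_means = {z: np.mean(v) for z, v in samples.items()}
    # Keep ground actions whose abstract counterpart looks promising
    # (illustrative top-k cut; the paper's rule is derived from CA theory).
    top = sorted(abstract_means, key=abstract_means.get, reverse=True)[:keep_top_k]
    candidates = [a for a in actions if abstraction_map(a) in top]
    # Phase 2: classical UCB restricted to the candidate set in the target CMAB.
    return ucb_restricted(pull_ground, candidates, horizon)
```

Restricting phase 2 to a smaller candidate set is what drives the regret reduction relative to running UCB over all ground actions, since fewer suboptimal arms need to be explored in the expensive environment.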
Similar Papers
Hybrid Combinatorial Multi-armed Bandits with Probabilistically Triggered Arms
Machine Learning (CS)
Learns faster by mixing old data and new tries.
Clus-UCB: A Near-Optimal Algorithm for Clustered Bandits
Machine Learning (CS)
Helps computers learn faster by grouping similar choices.
Multi-User Contextual Cascading Bandits for Personalized Recommendation
Machine Learning (CS)
Shows better recommendations to many people at once.