Stochastic Shortest Path with Sparse Adversarial Costs
By: Emmeran Johnson, Alberto Rumi, Ciara Pike-Burke, and more
Potential Business Impact:
Finds faster paths when only a few choices cost money.
We study the adversarial Stochastic Shortest Path (SSP) problem with sparse costs under full-information feedback. In the known transition setting, existing bounds based on Online Mirror Descent (OMD) with negative-entropy regularization scale with $\sqrt{\log SA}$, where $SA$ is the size of the state-action space. While we show that this is optimal in the worst case, this bound fails to capture the benefits of sparsity when only a small number $M \ll SA$ of state-action pairs incur cost. In fact, we show that negative-entropy regularization is inherently non-adaptive to sparsity: it provably incurs regret scaling with $\sqrt{\log S}$ on sparse problems. Instead, we propose a family of $\ell_r$-norm regularizers ($r \in (1,2)$) that adapts to the sparsity and achieves regret scaling with $\sqrt{\log M}$ instead of $\sqrt{\log SA}$. We show this is optimal via a matching lower bound, highlighting that $M$, rather than $SA$, captures the effective dimension of the problem. Finally, in the unknown transition setting, the benefits of sparsity are limited: we prove that even on sparse problems, the minimax regret for any learner scales polynomially with $SA$.
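To make the contrast between the two regularizers concrete, here is a minimal sketch of one OMD step with each. It is an illustration only: the paper runs OMD over occupancy measures of the SSP, whereas this toy version works on the probability simplex, and the step size `eta`, the choice `r = 1.5`, and the Euclidean projection back onto the simplex are all simplifying assumptions (the exact $\ell_r$ update would project in Bregman divergence over the occupancy-measure polytope).

```python
import numpy as np

def simplex_projection(v):
    """Euclidean projection onto the probability simplex (Duchi et al., 2008)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def omd_negative_entropy(x, cost, eta):
    """One OMD step with the negative-entropy regularizer on the simplex.
    This has a closed form: exponentiated gradient / multiplicative weights."""
    w = x * np.exp(-eta * cost)
    return w / w.sum()

def omd_lr_norm(x, cost, eta, r):
    """One OMD-style step with the l_r-norm mirror map Phi(x) = 0.5 * ||x||_r^2,
    followed by a Euclidean projection back onto the simplex (a simplification
    of the exact Bregman projection)."""
    s = r / (r - 1.0)  # conjugate exponent of r
    # Forward map y = grad Phi(x), then a gradient step in the dual space.
    y = np.abs(x) ** (r - 1) * np.linalg.norm(x, ord=r) ** (2 - r)
    y = y - eta * cost
    # Inverse map x = grad Phi*(y).
    x_new = np.sign(y) * np.abs(y) ** (s - 1) * np.linalg.norm(y, ord=s) ** (2 - s)
    return simplex_projection(x_new)

# Toy run with sparse costs: only M of d coordinates ever incur cost.
rng = np.random.default_rng(0)
d, M, eta, r = 1000, 5, 0.1, 1.5
x_ent = np.full(d, 1.0 / d)
x_lr = np.full(d, 1.0 / d)
for t in range(100):
    cost = np.zeros(d)
    cost[rng.choice(d, size=M, replace=False)] = rng.random(M)
    x_ent = omd_negative_entropy(x_ent, cost, eta)
    x_lr = omd_lr_norm(x_lr, cost, eta, r)
```

In this simplified picture, the abstract's message is that the regret of the entropy update is governed by $\log$ of the full dimension ($d$, standing in for $SA$), while a suitably tuned $\ell_r$ update with $r \in (1,2)$ depends only on $\log M$, the number of coordinates that ever incur cost.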
Similar Papers
Regret Lower Bounds for Decentralized Multi-Agent Stochastic Shortest Path Problems
Machine Learning (CS)
Helps robots work together to finish tasks.
Regret Guarantees for Linear Contextual Stochastic Shortest Path
Machine Learning (CS)
Teaches computers to win games with hidden rules.
The Hidden Cost of Approximation in Online Mirror Descent
Machine Learning (CS)
Makes computer learning more accurate with bad data.