Online Algorithms for Repeated Optimal Stopping: Achieving Both Competitive Ratio and Regret Bounds
By: Tsubasa Harada, Yasushi Kawase, Hanna Sumita
Potential Business Impact:
Helps make better choices when repeating decisions.
We study the repeated optimal stopping problem, which generalizes the classical optimal stopping problem with an unknown distribution to a setting where the same problem is solved repeatedly over $T$ rounds. In this framework, we aim to design algorithms that guarantee a competitive ratio in each round while also achieving sublinear regret across all rounds. Our primary contribution is a general algorithmic framework that achieves these objectives simultaneously for a wide array of repeated optimal stopping problems. The core idea is to dynamically select an algorithm for each round, choosing between two candidates: (1) an empirically optimal algorithm derived from the history of observations, and (2) a sample-based algorithm with a proven competitive ratio guarantee. Based on this approach, we design an algorithm that performs no worse than the baseline sample-based algorithm in every round, while ensuring that the total regret is bounded by $\tilde{O}(\sqrt{T})$. We demonstrate the broad applicability of our framework to canonical problems, including the prophet inequality, the secretary problem, and their variants under adversarial, random, and i.i.d. input models. For example, for the repeated prophet inequality problem, our method achieves a competitive ratio of $1/2$ from the second round on and a regret of $\tilde{O}(\sqrt{T})$. Furthermore, we establish a regret lower bound of $\Omega(\sqrt{T})$ even in the i.i.d. model, confirming that our algorithm's regret is optimal in the number of rounds $T$ up to logarithmic factors.
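The per-round selection idea can be illustrated with a minimal Python sketch for a repeated prophet-inequality-style setting. It is not the authors' algorithm, and everything in it is an assumption made for illustration. The sample-based candidate uses the maximum of the previous round as its threshold, a well-known single-sample rule. The empirical candidate is restricted to single-threshold policies. The selection rule in `choose_threshold`, including its confidence margin and the `margin_scale` parameter, is hypothetical. All function names are invented here.

```python
import math
import random
from typing import List, Sequence


def run_threshold(values: Sequence[float], threshold: float) -> float:
    """Accept the first value that is at least `threshold`; reward is 0.0 if none is."""
    for v in values:
        if v >= threshold:
            return v
    return 0.0


def sample_based_threshold(previous_round: Sequence[float]) -> float:
    """Baseline candidate (assumed here): threshold = maximum of one past round."""
    return max(previous_round)


def empirical_best_threshold(history: List[Sequence[float]]) -> float:
    """Empirical candidate (assumed here): the single threshold with the
    highest total reward on the observed rounds."""
    candidates = sorted({v for rnd in history for v in rnd})
    return max(candidates, key=lambda t: sum(run_threshold(r, t) for r in history))


def choose_threshold(history: List[Sequence[float]], margin_scale: float = 1.0) -> float:
    """Hypothetical selection rule: use the empirical candidate only if it beats
    the baseline on earlier rounds by more than a confidence margin."""
    baseline = sample_based_threshold(history[-1])
    earlier = history[:-1]  # rounds not used to build the baseline threshold
    if not earlier:
        return baseline
    empirical = empirical_best_threshold(earlier)
    m = len(earlier)

    def average_reward(t: float) -> float:
        return sum(run_threshold(r, t) for r in earlier) / m

    scale = max(max(r) for r in earlier)
    margin = margin_scale * scale * math.sqrt(math.log(m + 1) / m)
    if average_reward(empirical) - margin >= average_reward(baseline):
        return empirical
    return baseline


def simulate(rounds: int, n: int, seed: int = 0) -> float:
    """Play `rounds` rounds of `n` i.i.d. uniform values; return the total reward."""
    rng = random.Random(seed)
    history: List[List[float]] = []
    total = 0.0
    for _ in range(rounds):
        values = [rng.random() for _ in range(n)]
        # With no history there is no sample, so accept the first value.
        threshold = choose_threshold(history) if history else 0.0
        total += run_threshold(values, threshold)
        history.append(values)
    return total
```

Calling `simulate(50, 5)` plays 50 rounds of five uniform values each. In this sketch, a larger `margin_scale` keeps the policy on the sample-based baseline for longer, while a smaller one switches to the empirical threshold sooner.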