Geometry-Inspired Unified Framework for Discounted and Average Reward MDPs

Published: October 27, 2025 | arXiv ID: 2510.23914v1

By: Arsenii Mustafin, Xinyi Sheng, Dominik Baumann

Potential Business Impact:

Unifies the mathematical analysis of discounted- and average-reward MDPs, which could simplify the design and analysis of reinforcement learning algorithms.

Business Areas:
A/B Testing, Data and Analytics

The theoretical analysis of Markov Decision Processes (MDPs) is commonly split into two cases, the average-reward case and the discounted-reward case, which, while sharing similarities, are typically analyzed separately. In this work, we extend a recently introduced geometric interpretation of MDPs for the discounted-reward case to the average-reward case, thereby unifying both. This allows us to extend a major result known for the discounted-reward case to the average-reward case: when the optimal policy is unique and ergodic, the Value Iteration algorithm achieves a geometric convergence rate.
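To make the convergence claim concrete, below is a minimal sketch of relative value iteration for the average-reward case on a small, hand-made ergodic MDP. It is not the paper's code; the transition probabilities, rewards, and the 3-state/2-action instance are illustrative assumptions. It tracks the span seminorm of successive Bellman updates, which is expected to shrink geometrically when the optimal policy is unique and ergodic.

```python
import numpy as np

# Illustrative 3-state, 2-action MDP (assumed values, not from the paper).
# P[a, s, s'] = transition probability, R[a, s] = immediate reward.
P = np.array([
    [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1],
     [0.2, 0.3, 0.5]],
    [[0.3, 0.4, 0.3],
     [0.5, 0.3, 0.2],
     [0.1, 0.1, 0.8]],
])
R = np.array([
    [1.0, 0.0, 2.0],
    [0.5, 1.5, 0.2],
])

def relative_value_iteration(P, R, ref_state=0, iters=100, tol=1e-12):
    """Average-reward value iteration with a reference-state offset.

    Returns the gain (average reward) estimate, the bias vector, and the
    span seminorm of each Bellman update.
    """
    _, n_states, _ = P.shape
    h = np.zeros(n_states)
    spans = []
    gain = 0.0
    for _ in range(iters):
        q = R + P @ h                # Bellman backup, shape (n_actions, n_states)
        th = q.max(axis=0)           # greedy maximization over actions
        diff = th - h
        spans.append(diff.max() - diff.min())
        gain = th[ref_state]         # gain estimate at the reference state
        h = th - th[ref_state]       # re-center so the iterates stay bounded
        if spans[-1] < tol:
            break
    return gain, h, spans

gain, bias, spans = relative_value_iteration(P, R)
print(f"estimated average reward: {gain:.6f}")
# Ratios of successive spans settling near a constant below 1 indicate
# geometric convergence, consistent with the result stated in the abstract.
print([round(s2 / s1, 4) for s1, s2 in zip(spans, spans[1:]) if s1 > 0][:10])
```

The reference-state subtraction is a standard trick for average-reward value iteration: without it the iterates grow linearly with the gain, while the re-centered iterates converge to a bias vector.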

Country of Origin
🇫🇮 Finland

Page Count
12 pages

Category
Computer Science:
Machine Learning (CS)