
On the Convergence of the Policy Iteration for Infinite-Horizon Nonlinear Optimal Control Problems

Published: July 14, 2025 | arXiv ID: 2507.09994v1

By: Tobias Ehring, Behzad Azmi, Bernard Haasdonk

Potential Business Impact:

Provides convergence guarantees for policy iteration, the workhorse behind many learning-based control methods, making the synthesis of feedback controllers for robots and other engineered systems more reliable.

Business Areas:
Innovation Management, Professional Services

Policy iteration (PI) is a widely used algorithm for synthesizing optimal feedback control policies across many engineering and scientific applications. When PI is deployed on infinite-horizon, nonlinear, autonomous optimal-control problems, however, significant theoretical challenges emerge, particularly when the computational state space is restricted to a bounded domain. In this paper, we investigate these challenges and show that the viability of PI in this setting hinges on the existence, uniqueness, and regularity of solutions to the Generalized Hamilton-Jacobi-Bellman (GHJB) equation solved at each iteration. To ensure a well-posed iterative scheme, the GHJB solution must possess sufficient smoothness, and the domain on which the GHJB equation is solved must remain forward-invariant under the closed-loop dynamics induced by the current policy. Although these requirements are fundamental to the method's convergence, previous studies have largely overlooked them. This paper closes that gap by introducing a constructive procedure that guarantees forward invariance of the computational domain throughout the entire PI sequence and by establishing sufficient conditions under which a suitably regular GHJB solution exists at every iteration. Numerical results for a grid-based implementation of PI are presented to support the theoretical findings.
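
To make the iteration structure concrete, the following is a minimal sketch of GHJB-based policy iteration in its standard form (solve the GHJB equation for the current policy, then update the policy from the value-function gradient). It is not the paper's constructive procedure: the scalar dynamics dx/dt = -x^3 + u, the running cost x^2 + u^2, the grid on [-1, 1], and the initial admissible policy u0(x) = -x are all assumptions chosen so that the 1D GHJB equation can be solved pointwise.

```python
import numpy as np

# Illustrative sketch (not the paper's implementation): grid-based policy
# iteration for the assumed scalar problem  dx/dt = -x^3 + u  with running
# cost x^2 + u^2.  In 1D the GHJB equation
#     V_k'(x) * (f(x) + u_k(x)) + x^2 + u_k(x)^2 = 0
# can be solved pointwise for V_k'(x); the standard policy update is
#     u_{k+1}(x) = -(1/2) * V_k'(x).

x = np.linspace(-1.0, 1.0, 401)   # bounded computational grid
f = -x**3                         # open-loop drift f(x)
u = -x                            # initial admissible (stabilizing) policy u_0

for k in range(20):
    closed_loop = f + u           # closed-loop vector field f(x) + u_k(x)
    cost = x**2 + u**2            # running cost along the current policy
    # Solve the 1D GHJB pointwise; V'(0) = 0 by smoothness at the origin.
    dV = np.where(np.abs(x) > 1e-12, -cost / closed_loop, 0.0)
    u_new = -0.5 * dV             # policy update from the value gradient
    step = np.max(np.abs(u_new - u))
    u = u_new
    if step < 1e-10:
        break

# For this toy problem the HJB equation can be solved in closed form,
# giving the optimal feedback u*(x) = x^3 - |x| * sqrt(x^4 + 1).
u_star = x**3 - np.abs(x) * np.sqrt(x**4 + 1)
print(f"iterations: {k + 1}, max |u_PI - u*| = {np.max(np.abs(u - u_star)):.2e}")
```

The sketch also illustrates why the issues studied in the paper matter: the pointwise division is only well defined because each closed-loop vector field keeps pointing toward the origin on the bounded grid, i.e. the computational domain stays forward-invariant and each GHJB solution remains smooth from one iteration to the next.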

Country of Origin
🇩🇪 Germany

Page Count
35 pages

Category
Mathematics:
Optimization and Control