Incremental Policy Iteration for Unknown Nonlinear Systems with Stability and Performance Guarantees
By: Qingkai Meng, Fenglan Wang, Lin Zhao
Potential Business Impact:
Teaches machines to learn safe, near-optimal control of themselves without a model.
This paper proposes a general incremental policy iteration adaptive dynamic programming (ADP) algorithm for model-free robust optimal control of unknown nonlinear systems. The approach integrates recursive least squares estimation with linear ADP principles, which greatly simplifies implementation while preserving adaptive learning capability. In particular, we develop a sufficient condition on the discount factor under which the optimal policy can be learned starting from an initial policy that is not necessarily stabilizing. Moreover, we characterize the robust stability of the closed-loop system and the near-optimality of the iterative policies. Finally, we present numerical simulations demonstrating the effectiveness of the proposed method.
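The abstract does not come with code, so the following is a minimal sketch, not the authors' algorithm: it illustrates the flavor of policy-iteration ADP with recursive least squares (RLS) on a discounted linear-quadratic special case. The matrices A and B, the discount factor gamma, and the helper names (phi, unpack, rls_step) are all illustrative assumptions; A and B are used only to simulate data and are never shown to the learner, mirroring the model-free setting.

```python
# A minimal sketch, NOT the paper's exact algorithm: discounted policy-iteration
# Q-learning with recursive least squares (RLS) on a linear-quadratic special case.
# A, B, gamma, and all helper names are illustrative assumptions; A and B are
# used only to simulate data and are hidden from the learner.
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[1.1, 0.2],          # open-loop unstable: spectral radius 1.1
              [0.0, 0.9]])
B = np.array([[0.0],
              [1.0]])
n, m = B.shape
Qc, Rc = np.eye(n), np.eye(m)      # stage cost x'Qc x + u'Rc u
gamma = 0.75                       # discount: small enough that the non-stabilizing
                                   # initial policy has finite discounted cost
                                   # (sqrt(gamma) * 1.1 < 1), large enough that
                                   # the learned policy is useful

def phi(z):
    """Quadratic basis: packed upper triangle of z z' (off-diagonals doubled)."""
    outer = np.outer(z, z)
    i, j = np.triu_indices(len(z))
    return np.where(i == j, 1.0, 2.0) * outer[i, j]

def unpack(theta, d):
    """Rebuild the symmetric matrix H so that phi(z) @ theta == z' H z."""
    Hu = np.zeros((d, d))
    Hu[np.triu_indices(d)] = theta
    return Hu + Hu.T - np.diag(np.diag(Hu))

def rls_step(theta, P, w, y):
    """One recursive least squares update for the model w' theta ~= y."""
    Pw = P @ w
    k = Pw / (1.0 + w @ Pw)
    return theta + k * (y - w @ theta), P - np.outer(k, Pw)

d, p = n + m, (n + m) * (n + m + 1) // 2
K = np.zeros((m, n))               # initial policy u = K x: NOT stabilizing
for it in range(8):
    theta, P = np.zeros(p), 1e4 * np.eye(p)
    x = rng.normal(size=n)
    for _ in range(500):           # policy evaluation along an exploring trajectory
        u = K @ x + 0.5 * rng.normal(size=m)        # probing noise for excitation
        c = x @ Qc @ x + u @ Rc @ u                 # observed stage cost
        x1 = A @ x + B @ u                          # observed next state
        # Bellman identity: phi(z)'theta - gamma*phi(z1)'theta = c, z1 on-policy
        w = phi(np.concatenate([x, u])) - gamma * phi(np.concatenate([x1, K @ x1]))
        theta, P = rls_step(theta, P, w, c)
        x = x1 if np.linalg.norm(x1) < 1e2 else rng.normal(size=n)  # reset blow-ups
    H = unpack(theta, d)
    K = -np.linalg.solve(H[n:, n:], H[n:, :n])      # policy improvement (greedy)

print("learned gain K:\n", K)
print("closed-loop spectral radius:", max(abs(np.linalg.eigvals(A + B @ K))))
```

In this linear-quadratic setting the greedy policy has the closed form K = -H_uu^{-1} H_ux, which is what the last line of the loop computes; the paper's sufficient condition on the discount factor plays the role that the hand-picked gamma plays in this sketch, balancing finite discounted cost under a non-stabilizing initial policy against stability of the learned policy.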
Similar Papers
Rollout-Based Approximate Dynamic Programming for MDPs with Information-Theoretic Constraints
Systems and Control
Helps computers make better choices with less data.
Data-Driven Yet Formal Policy Synthesis for Stochastic Nonlinear Dynamical Systems
Systems and Control
Teaches robots to control tricky machines reliably.
Robustness of Online Identification-based Policy Iteration to Noisy Data
Systems and Control
Teaches robots to learn and improve tasks.