Adaptive and Regime-Aware RL for Portfolio Optimization

Published: September 17, 2025 | arXiv ID: 2509.14385v1

By: Gabriel Nixon Raj

Potential Business Impact:

Helps computers make smarter money choices.

Business Areas:
A/B Testing, Data and Analytics

This study proposes a regime-aware reinforcement learning framework for long-horizon portfolio optimization. Moving beyond traditional feedforward and GARCH-based models, we design realistic environments where agents dynamically reallocate capital in response to latent macroeconomic regime shifts. Agents receive hybrid observations and are trained using constrained reward functions that incorporate volatility penalties, capital resets, and tail-risk shocks. We benchmark multiple architectures, including PPO, LSTM-based PPO, and Transformer PPO, against classical baselines such as equal-weight and Sharpe-optimized portfolios. Our agents demonstrate robust performance under financial stress. While Transformer PPO achieves the highest risk-adjusted returns, LSTM variants offer a favorable trade-off between interpretability and training cost. The framework promotes regime-adaptive, explainable reinforcement learning for dynamic asset allocation.
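The abstract describes training agents with constrained reward functions that combine volatility penalties and tail-risk shocks. A minimal sketch of such a shaped reward is below; the function name, penalty coefficients, and tail threshold are all hypothetical illustrations, not values from the paper.

```python
import numpy as np

def regime_aware_reward(portfolio_return, recent_returns,
                        vol_penalty=0.5, tail_threshold=-0.05,
                        tail_penalty=1.0):
    """Illustrative constrained reward: raw portfolio return minus a
    volatility penalty, with an extra penalty when the step return
    breaches a tail-risk threshold. Coefficients are hypothetical."""
    volatility = np.std(recent_returns)          # realized volatility proxy
    reward = portfolio_return - vol_penalty * volatility
    if portfolio_return < tail_threshold:        # tail-risk shock penalty
        reward -= tail_penalty * abs(portfolio_return)
    return reward

# Example: a calm step incurs only the volatility penalty,
# while a large drawdown is penalized twice.
calm = regime_aware_reward(0.01, [0.008, 0.012, 0.010])
crash = regime_aware_reward(-0.10, [0.01, -0.03, -0.10])
```

In a PPO training loop, a reward of this shape would replace raw returns at each environment step, steering the agent toward regime-adaptive allocations rather than maximizing return alone.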

Country of Origin
🇺🇸 United States

Page Count
28 pages

Category
Quantitative Finance: Portfolio Management