The Exploratory Multi-Asset Mean-Variance Portfolio Selection using Reinforcement Learning

Published: May 12, 2025 | arXiv ID: 2505.07537v1

By: Yu Li, Yuhan Wu, Shuhua Zhang

Potential Business Impact:

Helps computers pick the best stocks to buy.

Business Areas:

A/B Testing, Data and Analytics

In this paper, we study continuous-time multi-asset mean-variance (MV) portfolio selection in a time-varying financial market using a reinforcement learning (RL) algorithm, specifically the soft actor-critic (SAC) algorithm. A family of Gaussian portfolio selections is derived, and a policy iteration process is constructed to learn the optimal exploratory portfolio selection. We prove theoretically that the policy iteration process converges, and we develop the SAC algorithm on that basis. To improve the algorithm's stability and learning accuracy in the multi-asset scenario, we divide the model parameters that influence the optimal portfolio selection into three parts and learn each part progressively. Numerical studies in simulated and real financial markets confirm the superior performance of the proposed SAC algorithm under various criteria.
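To make the idea of a "Gaussian portfolio selection" concrete, here is a minimal Python sketch of sampling an exploratory multi-asset allocation. It is an illustration only, not the paper's formulas: the mean `-(cov^-1 excess_return) * (wealth - target)`, the exploration covariance `(temperature / 2) * cov^-1`, and every function and parameter name below are assumptions chosen for the example, with constant market parameters rather than the time-varying ones the abstract describes.

```python
import math
import random


def cholesky(a):
    """Lower-triangular factor low with low @ low^T == a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def forward_sub(low, b):
    """Solve low @ y = b for lower-triangular low."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    return y


def back_sub(low, y):
    """Solve low^T @ x = y for lower-triangular low."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def policy_mean(excess_return, cov, wealth, target):
    """Assumed policy mean: -(cov^-1 @ excess_return) * (wealth - target)."""
    low = cholesky(cov)
    direction = back_sub(low, forward_sub(low, excess_return))
    return [-d * (wealth - target) for d in direction]


def sample_allocation(mean, cov, temperature, rng):
    """Draw one allocation from N(mean, (temperature / 2) * cov^-1) (assumed form)."""
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    noise = back_sub(low, z)  # low^-T @ z has covariance cov^-1
    scale = math.sqrt(temperature / 2.0)
    return [m + scale * e for m, e in zip(mean, noise)]


# Hypothetical two-asset market: return covariance and excess returns over the risk-free rate.
cov = [[0.04, 0.01], [0.01, 0.09]]
excess = [0.05, 0.03]

mean = policy_mean(excess, cov, wealth=1.0, target=1.5)
action = sample_allocation(mean, cov, temperature=0.1, rng=random.Random(0))
```

A larger `temperature` widens the sampled allocations around the mean, which is the exploration side of the trade-off; setting it to zero collapses the policy to its deterministic mean.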

Exploratory Mean-Variance with Jumps: An Equilibrium Approach

Portfolio Management

Helps investors make more money in the stock market.

10 Dec 2025

90%

Mean-Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study

Portfolio Management

Helps investors pick winning stocks automatically.

8 Dec 2024

90%

Cryptocurrency Portfolio Management with Reinforcement Learning: Soft Actor-Critic and Deep Deterministic Policy Gradient Algorithms

Computational Finance

Helps computers make smart money choices in crypto.

16 Nov 2025

Country of Origin

🇨🇳 China

Page Count

35 pages
