Online reinforcement learning via sparse Gaussian mixture model Q-functions
By: Minh Vu, Konstantinos Slavakis
Potential Business Impact:
Teaches computers to learn faster with less data.
This paper introduces a structured and interpretable online policy-iteration framework for reinforcement learning (RL), built around the novel class of sparse Gaussian mixture model Q-functions (S-GMM-QFs). Extending earlier work that trained GMM-QFs offline, the framework uses an online scheme that leverages streaming data to encourage exploration. Model complexity is regulated through sparsification via Hadamard overparametrization, which mitigates overfitting while preserving expressiveness. The parameter space of S-GMM-QFs is naturally endowed with a Riemannian manifold structure, allowing principled parameter updates via online gradient descent on a smooth objective. Numerical tests show that S-GMM-QFs match the performance of dense deep RL (DeepRL) methods on standard benchmarks while using significantly fewer parameters, and that they maintain strong performance even in low-parameter-count regimes where sparsified DeepRL methods fail to generalize.
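To make the core construction concrete, below is a minimal sketch of a Q-function represented as a Gaussian mixture over state-action vectors whose mixture weights are sparsified through Hadamard overparametrization, i.e., factored element-wise as w = u ⊙ v so that ordinary L2 penalties on u and v behave like an L1 penalty on w. The function name sparse_gmm_q, the shapes, and the plain Euclidean parameterization are illustrative assumptions, not the paper's exact formulation; in particular, the Riemannian-manifold structure of the parameter space and the online gradient updates from streaming data are omitted here.

```python
import numpy as np

def sparse_gmm_q(state_action, means, covs, u, v):
    """Evaluate Q(s, a) = sum_k w_k * N([s; a]; mu_k, Sigma_k) with w = u * v.

    Sketch only: the mixture weights are Hadamard-overparametrized, so
    standard L2 regularization on (u, v) encourages sparsity in w.
    """
    w = u * v                      # Hadamard overparametrization of the weights
    d = state_action.shape[0]
    q = 0.0
    for mu_k, cov_k, w_k in zip(means, covs, w):
        diff = state_action - mu_k
        inv_k = np.linalg.inv(cov_k)
        norm_k = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov_k))
        q += w_k * np.exp(-0.5 * diff @ inv_k @ diff) / norm_k
    return q

# Toy usage: K = 3 Gaussian components over a 4-dimensional state-action vector.
rng = np.random.default_rng(0)
K, d = 3, 4
means = rng.normal(size=(K, d))
covs = np.stack([np.eye(d) for _ in range(K)])
u, v = rng.normal(size=K), rng.normal(size=K)
print(sparse_gmm_q(rng.normal(size=d), means, covs, u, v))
```

In the online setting described above, u, v, and the Gaussian parameters would be updated from streaming transitions by gradient steps on a regularized Bellman-style objective; the paper performs these updates on a Riemannian manifold, which this sketch does not attempt to reproduce.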
Similar Papers
GoRL: An Algorithm-Agnostic Framework for Online Reinforcement Learning with Generative Policies
Machine Learning (CS)
Lets robots learn complex moves safely and quickly.
Diffusion Fine-Tuning via Reparameterized Policy Gradient of the Soft Q-Function
Machine Learning (CS)
Makes AI art look better and more natural.
Quantum Reinforcement Learning-Guided Diffusion Model for Image Synthesis via Hybrid Quantum-Classical Generative Model Architectures
Quantum Physics
Makes AI art look better by adjusting its settings.