Online reinforcement learning via sparse Gaussian mixture model Q-functions
By: Minh Vu, Konstantinos Slavakis
Potential Business Impact:
Teaches computers to learn faster with less data.
This paper introduces a structured and interpretable online policy-iteration framework for reinforcement learning (RL), built around the novel class of sparse Gaussian mixture model Q-functions (S-GMM-QFs). Extending earlier work that trained GMM-QFs offline, the framework uses an online scheme that leverages streaming data to encourage exploration. Model complexity is regulated through sparsification via Hadamard overparametrization, which mitigates overfitting while preserving expressiveness. The parameter space of S-GMM-QFs is naturally endowed with a Riemannian manifold structure, allowing principled parameter updates via online gradient descent on a smooth objective. Numerical tests show that S-GMM-QFs match the performance of dense deep RL (DeepRL) methods on standard benchmarks while using significantly fewer parameters, and that they maintain strong performance even in low-parameter-count regimes where sparsified DeepRL methods fail to generalize.
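To make the core construction concrete, below is a minimal sketch of a Q-function represented as a Gaussian mixture over state-action vectors whose mixture weights are sparsified through Hadamard overparametrization, i.e., factored element-wise as w = u ⊙ v so that ordinary L2 penalties on u and v behave like an L1 penalty on w. The function name sparse_gmm_q, the shapes, and the plain Euclidean parameterization are illustrative assumptions, not the paper's exact formulation; in particular, the Riemannian-manifold structure of the parameter space and the online gradient updates from streaming data are omitted here.

```python
import numpy as np

def sparse_gmm_q(state_action, means, covs, u, v):
    """Evaluate Q(s, a) = sum_k w_k * N([s; a]; mu_k, Sigma_k) with w = u * v.

    Sketch only: the mixture weights are Hadamard-overparametrized, so
    standard L2 regularization on (u, v) encourages sparsity in w.
    """
    w = u * v                      # Hadamard overparametrization of the weights
    d = state_action.shape[0]
    q = 0.0
    for mu_k, cov_k, w_k in zip(means, covs, w):
        diff = state_action - mu_k
        inv_k = np.linalg.inv(cov_k)
        norm_k = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov_k))
        q += w_k * np.exp(-0.5 * diff @ inv_k @ diff) / norm_k
    return q

# Toy usage: K = 3 Gaussian components over a 4-dimensional state-action vector.
rng = np.random.default_rng(0)
K, d = 3, 4
means = rng.normal(size=(K, d))
covs = np.stack([np.eye(d) for _ in range(K)])
u, v = rng.normal(size=K), rng.normal(size=K)
print(sparse_gmm_q(rng.normal(size=d), means, covs, u, v))
```

In the online setting described above, u, v, and the Gaussian parameters would be updated from streaming transitions by gradient steps on a regularized Bellman-style objective; the paper performs these updates on a Riemannian manifold, which this sketch does not attempt to reproduce.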
Similar Papers
GoRL: An Algorithm-Agnostic Framework for Online Reinforcement Learning with Generative Policies
Machine Learning (CS)
Lets robots learn complex moves safely and quickly.
Diffusion Fine-Tuning via Reparameterized Policy Gradient of the Soft Q-Function
Machine Learning (CS)
Makes AI art look better and more natural.
Quantum Reinforcement Learning-Guided Diffusion Model for Image Synthesis via Hybrid Quantum-Classical Generative Model Architectures
Quantum Physics
Makes AI art look better by adjusting its settings.