Dissecting Quantum Reinforcement Learning: A Systematic Evaluation of Key Components
By: Javier Lazaro, Juan-Ignacio Vazquez, Pablo Garcia-Bringas
Potential Business Impact:
Clarifies which components of hybrid quantum-classical AI agents help or hinder training, guiding more reliable designs.
Quantum Reinforcement Learning (QRL) based on parameterised quantum circuits (PQCs) has emerged as a promising paradigm at the intersection of quantum computing and reinforcement learning (RL). By design, PQCs create hybrid quantum-classical models, but their practical applicability remains uncertain due to training instabilities, barren plateaus (BPs), and the difficulty of isolating the contribution of individual pipeline components. In this work, we dissect PQC-based QRL architectures through a systematic experimental evaluation of three aspects recurrently identified as critical: (i) data embedding strategies, with Data Reuploading (DR) as an advanced approach; (ii) ansatz design, particularly the role of entanglement; and (iii) post-processing blocks after quantum measurement, with a focus on the underexplored Output Reuse (OR) technique. Using a unified PPO-CartPole framework, we perform controlled comparisons between hybrid and classical agents under identical conditions. Our results show that OR, though purely classical, exhibits distinct behaviour in hybrid pipelines, that DR improves trainability and stability, and that stronger entanglement can degrade optimisation, offsetting classical gains. Together, these findings provide controlled empirical evidence of the interplay between quantum and classical contributions, and establish a reproducible framework for systematic benchmarking and component-wise analysis in QRL.
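To make the Data Reuploading idea concrete, the following is a minimal single-qubit sketch in plain Python. It is not the circuit used in the paper: the gate choice (RX for data encoding, RY for the trainable rotation), the single-qubit setting, and the names rx, ry, apply and expect_z are all assumptions made for illustration. It only shows the general pattern of encoding the same input again in every layer, between trainable gates, instead of once at the start.

```python
import math

def rx(theta):
    """2x2 matrix of a single-qubit RX(theta) rotation (assumed encoding gate)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def ry(theta):
    """2x2 matrix of a single-qubit RY(theta) rotation (assumed trainable gate)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def apply(gate, state):
    """Multiply a 2x2 gate into a 2-component state vector."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]

def expect_z(x, weights, reupload=True):
    """Return <Z> after len(weights) layers applied to |0>.

    With reupload=True the input x is encoded in every layer;
    with reupload=False it is encoded only in the first layer.
    """
    state = [1.0 + 0j, 0.0 + 0j]
    for layer, w in enumerate(weights):
        if reupload or layer == 0:
            state = apply(rx(x), state)   # data-encoding rotation
        state = apply(ry(w), state)       # trainable rotation
    return abs(state[0]) ** 2 - abs(state[1]) ** 2

if __name__ == "__main__":
    x = 0.3
    weights = [0.0, 0.0, 0.0]  # hypothetical parameter values
    print(expect_z(x, weights, reupload=True))
    print(expect_z(x, weights, reupload=False))
```

In this toy setting, setting all trainable angles to zero makes the effect easy to read off: three reuploaded layers compose to RX(3x), so the output is cos(3x), whereas a single encoding gives cos(x). Repeating the encoding therefore lets the same qubit represent higher-frequency functions of the input. Whether and how this carries over to trainability in a full PPO agent is what the paper evaluates empirically.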