Recurrent Off-Policy Deep Reinforcement Learning Doesn't Have to be Slow
By: Tyler Clark, Christine Evers, Jonathon Hare
Recurrent off-policy deep reinforcement learning models achieve state-of-the-art performance but are often sidelined due to their high computational demands. In response, we introduce RISE (Recurrent Integration via Simplified Encodings), a novel approach that enables recurrent networks in any image-based off-policy RL setting without significant computational overhead by combining learnable and non-learnable encoder layers. When integrating RISE into leading non-recurrent off-policy RL algorithms, we observe a 35.6% human-normalized interquartile mean (IQM) performance improvement across the Atari benchmark. We analyze various implementation strategies to highlight the versatility and potential of our proposed framework.
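The abstract does not spell out the architecture, but as a rough illustration of how "learnable and non-learnable encoder layers" might feed a recurrent core in an image-based agent, here is a minimal sketch (not the authors' code): a frozen, gradient-free convolutional stage, a small trainable projection, and a GRU over per-frame encodings. All layer sizes, module names, and the frame format are illustrative assumptions.

```python
import torch
import torch.nn as nn


class RecurrentImageEncoder(nn.Module):
    """Hypothetical encoder: frozen conv stage + learnable projection + GRU."""

    def __init__(self, in_channels: int = 4, hidden_dim: int = 256):
        super().__init__()
        # Non-learnable stage: randomly initialised convolutions with
        # gradients disabled, so they add little to the training cost.
        self.frozen_conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        )
        for p in self.frozen_conv.parameters():
            p.requires_grad = False
        # Learnable stage: a light projection trained as usual.
        self.proj = nn.Sequential(nn.Flatten(), nn.LazyLinear(hidden_dim), nn.ReLU())
        # Recurrent core over the per-frame encodings.
        self.rnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True)

    def forward(self, frames: torch.Tensor, h0=None):
        # frames: (batch, time, channels, height, width)
        b, t = frames.shape[:2]
        x = self.frozen_conv(frames.flatten(0, 1))  # encode each frame independently
        x = self.proj(x).view(b, t, -1)             # (batch, time, hidden_dim)
        out, h = self.rnn(x, h0)                    # recurrent summary over time
        return out, h


# Usage: encode a short stack of 84x84 Atari-style frame observations.
encoder = RecurrentImageEncoder()
obs = torch.rand(2, 5, 4, 84, 84)  # (batch, time, C, H, W)
features, hidden = encoder(obs)
print(features.shape)              # torch.Size([2, 5, 256])
```

Because only the projection and the GRU receive gradients, the per-update cost stays close to that of a non-recurrent agent, which is the kind of trade-off the abstract's efficiency claim suggests.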