Scalable Offline Model-Based RL with Action Chunks
By: Kwanyoung Park, Seohong Park, Youngwoon Lee, and more
Potential Business Impact:
Teaches computers to learn better from past experiences.
In this paper, we study whether model-based reinforcement learning (RL), in particular model-based value expansion, can provide a scalable recipe for tackling complex, long-horizon tasks in offline RL. Model-based value expansion fits an on-policy value function using length-n imaginary rollouts generated by the current policy and a learned dynamics model. While larger n reduces bias in value bootstrapping, it amplifies accumulated model errors over long horizons, degrading future predictions. We address this trade-off with an action-chunk model that predicts a future state from a sequence of actions (an "action chunk") instead of a single action, which reduces compounding errors. In addition, instead of directly training a policy to maximize rewards, we employ rejection sampling from an expressive behavioral action-chunk policy, which prevents model exploitation from out-of-distribution actions. We call this recipe Model-Based RL with Action Chunks (MAC). Through experiments on highly challenging tasks with large-scale datasets of up to 100M transitions, we show that MAC achieves the best performance among offline model-based RL algorithms, especially on challenging long-horizon tasks.
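To make the recipe concrete, below is a minimal sketch of the two ideas described in the abstract: chunk-level value expansion and rejection sampling from a behavioral action-chunk policy. The helper names (`chunk_model`, `chunk_reward`, `behavior_policy`, `value_fn`) and the constants are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical stand-ins for learned components (assumptions, not the authors' code):
# - chunk_model(s, a_chunk) -> predicted state after executing the whole action chunk
# - chunk_reward(s, a_chunk) -> (discounted) reward accumulated over the chunk
# - behavior_policy(s, k)   -> k candidate action chunks sampled from an expressive
#                              behavioral policy fit on the offline dataset
# - value_fn(s)             -> current value estimate V(s)

GAMMA = 0.99
CHUNK_LEN = 5          # h: actions per chunk
NUM_CHUNKS = 3         # chunk-level imaginary rollout length (n = NUM_CHUNKS * h steps)
NUM_CANDIDATES = 32    # K: candidates scored during rejection sampling


def select_action_chunk(s, chunk_model, chunk_reward, behavior_policy, value_fn):
    """Rejection sampling: score behavioral candidates with the model and keep
    the best one, so the acting policy never strays far from the data."""
    candidates = behavior_policy(s, NUM_CANDIDATES)  # e.g. shape (K, h, action_dim)
    scores = [
        chunk_reward(s, c) + (GAMMA ** CHUNK_LEN) * value_fn(chunk_model(s, c))
        for c in candidates
    ]
    return candidates[int(np.argmax(scores))]


def value_expansion_target(s, chunk_model, chunk_reward, behavior_policy, value_fn):
    """Value-expansion target computed with chunk-level rollouts.

    Rolling the model forward one chunk at a time (rather than one action at a
    time) needs fewer model calls per horizon, which reduces compounding error."""
    target, discount = 0.0, 1.0
    for _ in range(NUM_CHUNKS):
        a_chunk = select_action_chunk(s, chunk_model, chunk_reward,
                                      behavior_policy, value_fn)
        target += discount * chunk_reward(s, a_chunk)
        s = chunk_model(s, a_chunk)
        discount *= GAMMA ** CHUNK_LEN
    return target + discount * value_fn(s)  # bootstrap at the end of the rollout
```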
Similar Papers
CO-RFT: Efficient Fine-Tuning of Vision-Language-Action Models through Chunked Offline Reinforcement Learning
Robotics
Teaches robots new tasks with just a few examples.
Actor-Critic for Continuous Action Chunks: A Reinforcement Learning Framework for Long-Horizon Robotic Manipulation with Sparse Reward
Robotics
Teaches robots to do complex jobs faster.
Sample-Efficient Tabular Self-Play for Offline Robust Reinforcement Learning
Machine Learning (CS)
Teaches AI to win games even with unknown rules.