Atomic Action Slicing: Planner-Aligned Options for Generalist VLA Agents

Published: December 12, 2025 | arXiv ID: 2512.11584v1

By: Stefan Tabakov, Asen Popov, Dimitar Dimitrov and more

Current vision-language-action (VLA) models generalize poorly, particularly when tasks require new compositions of skills or objects. We introduce Atomic Action Slicing (AAS), a planner-aligned approach that decomposes long-horizon demonstrations into short, typed atomic actions that are easier for planners to use and policies to learn. Using LIBERO demonstrations, AAS produces a validated dataset of 2,124 atomic segments labeled with action type, temporal span, and confidence. A stronger segmenter (Gemini 2.5 Pro) closely matches planner-defined plans and remains robust under keyframe jitter, while smaller models perform worse on multi-object tasks. Fine-tuning CLIP-RT+ on our atomic dataset improves task success from 94.2% to 95.3% on LIBERO-Goal and 83.8% to 88.8% on LIBERO-Long. We publicly release the GATE-VLAP dataset on HuggingFace (https://huggingface.co/datasets/gate-institute/GATE-VLAP-datasets).
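
To make the idea of a typed atomic segment concrete, here is a minimal sketch of how a record carrying an action type, a temporal span, and a confidence might be represented and used to cut a demonstration into short clips. The class name, field names, action labels, and the 0.5 confidence threshold are all illustrative assumptions; they are not taken from the paper or from the released dataset's schema.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class AtomicSegment:
    """Hypothetical record for one atomic action (field names are assumptions)."""
    action_type: str    # e.g. "pick" or "place"; labels here are illustrative
    start_frame: int    # inclusive start of the temporal span
    end_frame: int      # exclusive end of the temporal span
    confidence: float   # segmenter confidence in [0, 1]


def slice_demo(
    frames: Sequence,
    segments: Sequence[AtomicSegment],
    min_confidence: float = 0.5,  # assumed threshold, not from the paper
) -> List[Tuple[str, list]]:
    """Cut a demonstration into (action_type, frames) clips in temporal order.

    Segments below the confidence threshold are skipped; spans that fall
    outside the demonstration raise ValueError.
    """
    clips = []
    for seg in sorted(segments, key=lambda s: s.start_frame):
        if seg.confidence < min_confidence:
            continue
        if not (0 <= seg.start_frame < seg.end_frame <= len(frames)):
            raise ValueError(f"invalid span for segment: {seg}")
        clips.append((seg.action_type, list(frames[seg.start_frame:seg.end_frame])))
    return clips


# Toy demonstration: integers stand in for per-frame observations.
demo = list(range(10))
segments = [
    AtomicSegment("place", 6, 10, 0.4),  # low confidence, filtered out
    AtomicSegment("pick", 0, 6, 0.9),
]
clips = slice_demo(demo, segments)
```

In this toy run only the "pick" clip survives the filter. A real pipeline would load spans and labels from the dataset rather than hard-coding them.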

RoboAct-CLIP: Video-Driven Pre-training of Atomic Action Understanding for Robotics

Robotics

Teaches robots to do tasks by watching videos.

2 Apr 2025 0

88%

From Observation to Action: Latent Action-based Primitive Segmentation for VLA Pre-training in Industrial Settings

CV and Pattern Recognition

Teaches robots to do jobs by watching videos.

26 Nov 2025 0

88%

Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach

Robotics

Makes robots learn and do tasks better.

2 Dec 2025 1
