Reasoning-VLA: A Fast and General Vision-Language-Action Reasoning Model for Autonomous Driving
By: Dapeng Zhang, Zhenlong Yuan, Zhangquan Chen, and more
Potential Business Impact:
Helps self-driving cars drive smarter and faster.
Vision-Language-Action (VLA) models have recently shown strong decision-making capabilities in autonomous driving. However, existing VLAs often struggle to achieve efficient inference and to generalize to novel autonomous vehicle configurations and driving scenarios. In this paper, we propose Reasoning-VLA, a general and fast action-generation VLA framework. The proposed model employs a set of learnable action queries, initialized via Gaussian sampling from ground-truth trajectories in the training corpus. These learnable queries interact with reasoning-enhanced vision-language features to generate continuous action trajectories in parallel. To promote robust generalization, we consolidate eight publicly available autonomous driving datasets into a standardized, Chain-of-Thought reasoning-based, and easy-to-use data format for model training. Leveraging both supervised learning and reinforcement-learning fine-tuning, extensive empirical evaluations across multiple benchmarks demonstrate that Reasoning-VLA achieves state-of-the-art performance, superior generalization, and the fastest inference speed reported to date.
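The core mechanism described in the abstract, learnable action queries that cross-attend to reasoning-enhanced vision-language features and decode continuous trajectories in parallel, can be illustrated with a short sketch. The PyTorch snippet below is an assumption-based illustration, not the authors' released implementation: the class name, dimensions, layer counts, and the `init_queries_from_trajectories` helper are all hypothetical.

```python
# Minimal sketch (assumptions, not the authors' code): learnable action queries
# cross-attend to vision-language features and regress waypoints in one pass.
import torch
import torch.nn as nn


class ActionQueryDecoder(nn.Module):
    def __init__(self, num_queries=20, d_model=512, n_heads=8, n_layers=2, action_dim=2):
        super().__init__()
        # Learnable action queries; see init_queries_from_trajectories for the
        # assumed Gaussian initialization from ground-truth trajectories.
        self.queries = nn.Parameter(torch.randn(num_queries, d_model) * 0.02)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        # Regress a continuous waypoint (e.g., x, y) from each query in parallel.
        self.head = nn.Linear(d_model, action_dim)

    @torch.no_grad()
    def init_queries_from_trajectories(self, traj_embeddings, sigma=0.1):
        """Assumed scheme: queries are Gaussian samples around embedded
        ground-truth trajectories from the training corpus.
        traj_embeddings: (num_queries, d_model)."""
        self.queries.copy_(traj_embeddings + sigma * torch.randn_like(traj_embeddings))

    def forward(self, vl_features):
        """vl_features: (batch, seq_len, d_model) reasoning-enhanced
        vision-language tokens. Returns (batch, num_queries, action_dim)."""
        batch = vl_features.size(0)
        q = self.queries.unsqueeze(0).expand(batch, -1, -1)
        decoded = self.decoder(q, vl_features)   # cross-attention to VL features
        return self.head(decoded)                # all waypoints decoded in parallel


if __name__ == "__main__":
    model = ActionQueryDecoder()
    feats = torch.randn(4, 128, 512)             # dummy vision-language features
    print(model(feats).shape)                    # torch.Size([4, 20, 2])
```

Because every waypoint comes from its own query in a single decoder pass, this style of head avoids autoregressive token-by-token generation, which is consistent with the fast-inference claim in the abstract.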
Similar Papers
AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning
CV and Pattern Recognition
Helps self-driving cars plan safer, faster trips.
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models
CV and Pattern Recognition
Teaches robots to think and do tasks.
DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action
CV and Pattern Recognition
Teaches robots to act and think better.