Can SSD-Mamba2 Unlock Reinforcement Learning for End-to-End Motion Control?
By: Gavin Tao, Yinuo Wang, Jinzhao Zhou
Potential Business Impact:
Robots learn to move better, faster, and more safely.
End-to-end reinforcement learning for motion control promises unified perception-action policies that scale across embodiments and tasks, yet most deployed controllers are either blind (proprioception-only) or rely on fusion backbones with unfavorable compute-memory trade-offs. Recurrent controllers struggle with long-horizon credit assignment, and Transformer-based fusion incurs quadratic cost in token length, limiting temporal and spatial context. We present a vision-driven cross-modal RL framework built on SSD-Mamba2, a selective state-space backbone that applies state-space duality (SSD) to enable both recurrent and convolutional scanning with hardware-aware streaming and near-linear scaling. Proprioceptive states and exteroceptive observations (e.g., depth tokens) are encoded into compact tokens and fused by stacked SSD-Mamba2 layers. The selective state-space updates retain long-range dependencies with markedly lower latency and memory use than quadratic self-attention, enabling longer look-ahead, higher token resolution, and stable training under limited compute. Policies are trained end-to-end under curricula that randomize terrain and appearance and progressively increase scene complexity. A compact, state-centric reward balances task progress, energy efficiency, and safety. Across diverse motion-control scenarios, our approach consistently surpasses strong state-of-the-art baselines in return, safety (collisions and falls), and sample efficiency, while converging faster at the same compute budget. These results suggest that SSD-Mamba2 provides a practical fusion backbone for scalable, foresightful, and efficient end-to-end motion control.
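To make the fusion idea concrete, below is a minimal NumPy sketch (not the paper's implementation) of the core pattern the abstract describes: proprioceptive and depth tokens are concatenated into one sequence and processed by a selective state-space recurrence, h_t = a_t h_{t-1} + B_t x_t, y_t = C_t h_t, whose cost grows linearly with sequence length instead of quadratically as in self-attention. All names, shapes, and parameters here are illustrative assumptions.

```python
import numpy as np

def selective_scan(x, A, B, C):
    """Hypothetical minimal selective state-space scan:
    h_t = a_t * h_{t-1} + B_t x_t ;  y_t = C_t h_t.
    One pass over the sequence, so cost is linear in length T."""
    T, d = x.shape
    n = B.shape[-1]                 # state dimension
    h = np.zeros((d, n))
    ys = []
    for t in range(T):
        h = A[t] * h + np.outer(x[t], B[t])  # input-dependent (selective) update
        ys.append(h @ C[t])                  # read out per-channel output
    return np.stack(ys)

# Fuse compact proprioceptive tokens and depth tokens into one sequence.
rng = np.random.default_rng(0)
T_prop, T_depth, d, n = 8, 16, 4, 3
tokens = np.concatenate([rng.normal(size=(T_prop, d)),    # proprioception
                         rng.normal(size=(T_depth, d))],  # depth tokens
                        axis=0)
T = T_prop + T_depth
A = rng.uniform(0.8, 1.0, size=(T, 1, 1))  # per-step decay gates (selectivity)
B = rng.normal(size=(T, n))
C = rng.normal(size=(T, n))

y = selective_scan(tokens, A, B, C)
print(y.shape)  # one output token per fused input token
```

In the full SSD-Mamba2 backbone this scan also has an equivalent convolutional form (the state-space duality), which is what allows hardware-aware streaming; the sketch above shows only the recurrent view.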
Similar Papers
LocoMamba: Vision-Driven Locomotion via End-to-End Deep Reinforcement Learning with Mamba
Robotics
Helps robots learn to move in new places faster.
Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM
CV and Pattern Recognition
Helps self-driving cars predict where others will go.