LocoMamba: Vision-Driven Locomotion via End-to-End Deep Reinforcement Learning with Mamba
By: Yinuo Wang, Gavin Tao
Potential Business Impact:
Helps robots learn to move in new places faster.
We introduce LocoMamba, a vision-driven cross-modal deep reinforcement learning (DRL) framework built on selective state-space models, specifically Mamba, which achieves near-linear-time sequence modeling, effectively captures long-range dependencies, and enables efficient training with longer sequences. First, we embed proprioceptive states with a multilayer perceptron and patchify depth images with a lightweight convolutional neural network, producing compact tokens that improve state representation. Second, stacked Mamba layers fuse these tokens via near-linear-time selective scanning, reducing latency and memory footprint, remaining robust to token length and image resolution, and providing an inductive bias that mitigates overfitting. Third, we train the policy end-to-end with Proximal Policy Optimization under terrain and appearance randomization and an obstacle-density curriculum, using a compact state-centric reward that balances progress, smoothness, and safety. We evaluate our method in challenging simulated environments with static and moving obstacles as well as uneven terrain. Compared with state-of-the-art baselines, our method achieves higher returns and success rates with fewer collisions, exhibits stronger generalization to unseen terrains and obstacle densities, and improves training efficiency by converging in fewer updates under the same compute budget.
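The abstract describes a three-step pipeline: embed proprioception with an MLP, patchify depth images with a lightweight CNN, and fuse the resulting tokens with stacked Mamba layers before a PPO policy head. The sketch below is a minimal illustration of that structure, not the authors' implementation: it assumes PyTorch and the open-source mamba_ssm package (whose selective-scan kernel typically requires a CUDA build), and all dimensions, the 64x64 depth resolution, the 16x16 patch size, and the two-layer depth are illustrative placeholders.

```python
# Minimal sketch of the cross-modal encoder outlined in the abstract.
# Assumptions: PyTorch, the `mamba_ssm` package, and placeholder sizes
# (48-D proprioception, 64x64 depth images, d_model=128, 2 Mamba layers).
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # selective state-space (Mamba) block


class LocoMambaEncoder(nn.Module):
    def __init__(self, proprio_dim=48, d_model=128, n_layers=2):
        super().__init__()
        # Proprioceptive state -> a single token via an MLP.
        self.proprio_mlp = nn.Sequential(
            nn.Linear(proprio_dim, d_model), nn.ELU(), nn.Linear(d_model, d_model)
        )
        # Depth image -> compact patch tokens via a lightweight CNN
        # (a stride-16 convolution patchifies a 64x64 depth map into 16 tokens).
        self.depth_patchify = nn.Conv2d(1, d_model, kernel_size=16, stride=16)
        # Stacked Mamba layers fuse the token sequence with near-linear-time
        # selective scanning.
        self.mamba_layers = nn.ModuleList(
            [Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2) for _ in range(n_layers)]
        )
        self.norm = nn.LayerNorm(d_model)

    def forward(self, proprio, depth):
        # proprio: (B, proprio_dim); depth: (B, 1, 64, 64)
        proprio_tok = self.proprio_mlp(proprio).unsqueeze(1)                # (B, 1, D)
        depth_tok = self.depth_patchify(depth).flatten(2).transpose(1, 2)   # (B, 16, D)
        tokens = torch.cat([proprio_tok, depth_tok], dim=1)                 # (B, 17, D)
        for layer in self.mamba_layers:
            tokens = tokens + layer(tokens)                                 # residual fusion
        # Pool to one embedding that PPO actor and critic heads would consume.
        return self.norm(tokens).mean(dim=1)                                # (B, D)


# Usage: a batch of 2 proprioceptive states and depth images.
enc = LocoMambaEncoder()
z = enc(torch.randn(2, 48), torch.randn(2, 1, 64, 64))
print(z.shape)  # torch.Size([2, 128])
```

The pooled embedding would then feed standard PPO actor and critic heads; the terrain/appearance randomization, obstacle-density curriculum, and reward shaping mentioned in the abstract live in the training loop and environment, which are not sketched here.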
Similar Papers
HuMam: Humanoid Motion Control via End-to-End Deep Reinforcement Learning with Mamba
Robotics
Makes robots walk and run more smoothly and efficiently.
Can SSD-Mamba2 Unlock Reinforcement Learning for End-to-End Motion Control?
Robotics
Robots learn to move better, faster, and more safely.