ManipForce: Force-Guided Policy Learning with Frequency-Aware Representation for Contact-Rich Manipulation
By: Geonhyup Lee, Yeongjin Lee, Kangmin Kim, and more
Potential Business Impact:
Robots learn to build things by feeling and seeing.
Contact-rich manipulation tasks such as precision assembly require precise control of interaction forces, yet existing imitation learning methods rely mainly on vision-only demonstrations. We propose ManipForce, a handheld system designed to capture high-frequency force-torque (F/T) and RGB data during natural human demonstrations for contact-rich manipulation. Building on these demonstrations, we introduce the Frequency-Aware Multimodal Transformer (FMT). FMT encodes asynchronous RGB and F/T signals using frequency- and modality-aware embeddings and fuses them via bi-directional cross-attention within a transformer diffusion policy. Through extensive experiments on six real-world contact-rich manipulation tasks, including gear assembly, box flipping, and battery insertion, FMT trained on ManipForce demonstrations achieves robust performance with an average success rate of 83% across all tasks, substantially outperforming RGB-only baselines. Ablation and sampling-frequency analyses further confirm that incorporating high-frequency F/T data and cross-modal integration improves policy performance, especially in tasks demanding high precision and stable contact.
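The fusion step described above can be sketched in code. The following is a minimal, hypothetical PyTorch illustration of fusing asynchronous RGB and F/T token streams with per-token time embeddings, learned modality embeddings, and bi-directional cross-attention; all class names, dimensions, and design details here are illustrative assumptions, not the authors' actual FMT implementation.

```python
import math
import torch
import torch.nn as nn

class BiDirectionalCrossAttentionFusion(nn.Module):
    """Illustrative sketch (not the paper's code): fuse RGB and
    force-torque (F/T) token sequences sampled at different rates."""

    def __init__(self, d_model: int = 128, n_heads: int = 4):
        super().__init__()
        # RGB tokens attend to F/T tokens, and vice versa.
        self.rgb_to_ft = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ft_to_rgb = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Learned modality embeddings mark which stream a token came from.
        self.rgb_mod = nn.Parameter(torch.zeros(1, 1, d_model))
        self.ft_mod = nn.Parameter(torch.zeros(1, 1, d_model))

    @staticmethod
    def time_embedding(timestamps: torch.Tensor, d_model: int) -> torch.Tensor:
        """Sinusoidal embedding of per-token capture times, so streams
        sampled at different frequencies (e.g. ~30 Hz RGB vs. hundreds
        of Hz F/T) share a common temporal reference."""
        half = d_model // 2
        freqs = torch.exp(
            -torch.arange(half, dtype=torch.float32) * math.log(1000.0) / half
        )
        args = timestamps.unsqueeze(-1) * freqs  # (B, T, half)
        return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)

    def forward(self, rgb_tokens, rgb_t, ft_tokens, ft_t):
        d = rgb_tokens.shape[-1]
        rgb = rgb_tokens + self.time_embedding(rgb_t, d) + self.rgb_mod
        ft = ft_tokens + self.time_embedding(ft_t, d) + self.ft_mod
        # Bi-directional cross-attention: each modality queries the other.
        rgb_fused, _ = self.rgb_to_ft(rgb, ft, ft)
        ft_fused, _ = self.ft_to_rgb(ft, rgb, rgb)
        # Residual fusion; concatenated tokens would feed a downstream
        # transformer diffusion policy head.
        return torch.cat([rgb + rgb_fused, ft + ft_fused], dim=1)

# Usage: 8 RGB tokens vs. 64 F/T tokens over the same 1 s window,
# mimicking the asynchronous sampling rates described in the abstract.
fusion = BiDirectionalCrossAttentionFusion(d_model=128, n_heads=4)
rgb = torch.randn(2, 8, 128)
ft = torch.randn(2, 64, 128)
rgb_t = torch.linspace(0.0, 1.0, 8).expand(2, 8)
ft_t = torch.linspace(0.0, 1.0, 64).expand(2, 64)
out = fusion(rgb, rgb_t, ft, ft_t)  # shape (2, 72, 128)
```

The key point the sketch makes concrete is that time embeddings let tokens from streams with mismatched sampling rates be aligned in a shared space before each modality attends to the other.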
Similar Papers
Feel the Force: Contact-Driven Learning from Humans
Robotics
Robots learn to grip and push like humans.
Unified Multimodal Diffusion Forcing for Forceful Manipulation
Robotics
Teaches robots to learn from seeing, doing, and feeling.
FACTR: Force-Attending Curriculum Training for Contact-Rich Policy Learning
Robotics
Robots learn to feel and pick up objects.