MetaFold: Language-Guided Multi-Category Garment Folding Framework via Trajectory Generation and Foundation Model
By: Haonan Chen, Junxiao Li, Ruihai Wu, and more
Potential Business Impact:
Teaches robots to fold many kinds of clothes from natural-language instructions.
Garment folding is a common yet challenging task in robotic manipulation. The deformability of garments leads to a vast state space and complex dynamics, which complicates precise and fine-grained manipulation. Previous approaches often rely on predefined key points or demonstrations, limiting their generalization across diverse garment categories. This paper presents a framework, MetaFold, that disentangles task planning from action prediction, learning each independently to enhance model generalization. It employs language-guided point cloud trajectory generation for task planning and a low-level foundation model for action prediction. This structure facilitates multi-category learning, enabling the model to adapt flexibly to various user instructions and folding tasks. Experimental results demonstrate the superiority of our proposed framework. Supplementary materials are available on our website: https://meta-fold.github.io/.
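To make the two-stage structure in the abstract concrete, here is a minimal sketch of the disentangled pipeline: a language-guided planner maps an instruction and the garment's point cloud to a trajectory of intermediate point-cloud states, and a separate low-level policy turns each planned state into a manipulation action. All class names, method signatures, and the hard-coded "fold in half" plan below are illustrative assumptions, not the paper's actual API or models.

```python
import numpy as np

class LanguageGuidedTrajectoryGenerator:
    """High-level task planner (hypothetical interface): maps a folding
    instruction plus the garment's current point cloud to a sequence of
    intermediate point-cloud states, i.e., the planned folding trajectory."""

    def plan(self, instruction: str, point_cloud: np.ndarray) -> list[np.ndarray]:
        # MetaFold learns this mapping; as a stand-in we hard-code a single
        # waypoint that mirrors the cloud across its x-midline ("fold in half").
        mid = point_cloud[:, 0].mean()
        target = point_cloud.copy()
        right_half = target[:, 0] > mid
        target[right_half, 0] = 2 * mid - target[right_half, 0]
        return [target]

class LowLevelFoundationPolicy:
    """Action predictor (hypothetical interface): given the current point
    cloud and the next planned state, outputs a pick-and-place action."""

    def act(self, point_cloud: np.ndarray, target: np.ndarray) -> dict:
        # Placeholder heuristic: grasp the point farthest from its target
        # position and move it there. The paper uses a learned foundation
        # model instead.
        dists = np.linalg.norm(point_cloud - target, axis=1)
        i = int(np.argmax(dists))
        return {"pick": point_cloud[i], "place": target[i]}

def fold(instruction: str, point_cloud: np.ndarray) -> list[dict]:
    """Run the disentangled pipeline: plan first, then predict an action
    for each planned waypoint, with the two stages trained independently."""
    planner = LanguageGuidedTrajectoryGenerator()
    policy = LowLevelFoundationPolicy()
    return [policy.act(point_cloud, waypoint)
            for waypoint in planner.plan(instruction, point_cloud)]

# Example: a 4-point "garment" folded in half along the x-axis.
cloud = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
print(fold("fold the shirt in half", cloud))
```

The point of this split is that the planner and the action predictor can be swapped or retrained independently, which is what lets the framework generalize across garment categories and instructions rather than memorizing category-specific keypoints or demonstrations.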
Similar Papers
FoldNet: Learning Generalizable Closed-Loop Policy for Garment Folding via Keypoint-Driven Asset and Demonstration Synthesis
Robotics
Teaches robots to fold clothes perfectly.
Beyond Static Perception: Integrating Temporal Context into VLMs for Cloth Folding
Robotics
Helps robots fold clothes by seeing and remembering.
Learning a General Model: Folding Clothing with Topological Dynamics
Robotics
Teaches robots to fold messy clothes.