MetaFold: Language-Guided Multi-Category Garment Folding Framework via Trajectory Generation and Foundation Model

Published: March 11, 2025 | arXiv ID: 2503.08372v2

By: Haonan Chen, Junxiao Li, Ruihai Wu and more

Potential Business Impact:

Teaches robots to fold clothes from any instruction.

Business Areas:
Motion Capture, Media and Entertainment, Video

Garment folding is a common yet challenging task in robotic manipulation. The deformability of garments leads to a vast state space and complex dynamics, which complicates precise and fine-grained manipulation. Previous approaches often rely on predefined key points or demonstrations, limiting their generalization across diverse garment categories. This paper presents a framework, MetaFold, that disentangles task planning from action prediction, learning each independently to enhance model generalization. It employs language-guided point cloud trajectory generation for task planning and a low-level foundation model for action prediction. This structure facilitates multi-category learning, enabling the model to adapt flexibly to various user instructions and folding tasks. Experimental results demonstrate the superiority of our proposed framework. Supplementary materials are available on our website: https://meta-fold.github.io/.
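The separation of task planning from action prediction described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `plan_trajectory`, `predict_action`, `fold`, and `PickPlace` are hypothetical, the "planner" is a hand-written rule rather than a learned language-guided generator, and the "action predictor" is a simple largest-displacement heuristic standing in for the foundation model. It only shows how a high-level stage that emits point cloud waypoints can be paired with a low-level stage that turns each waypoint into a pick-and-place action.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]
PointCloud = List[Point]


@dataclass
class PickPlace:
    """A single pick-and-place action (hypothetical action format)."""
    pick: Point
    place: Point


def plan_trajectory(cloud: PointCloud, instruction: str, steps: int = 3) -> List[PointCloud]:
    """Toy stand-in for a language-guided trajectory generator.

    Assumption: only two instructions are understood, and a "fold" is
    approximated by moving one half of the cloud linearly toward its
    mirror image across the plane x = 0.
    """
    if "left to right" in instruction:
        moves = lambda x: x < 0
    elif "right to left" in instruction:
        moves = lambda x: x > 0
    else:
        raise ValueError(f"unsupported instruction: {instruction!r}")

    waypoints = []
    for k in range(1, steps + 1):
        t = k / steps  # fraction of the fold completed at this waypoint
        waypoints.append(
            [(x + t * (-2 * x), y, z) if moves(x) else (x, y, z) for (x, y, z) in cloud]
        )
    return waypoints


def predict_action(current: PointCloud, target: PointCloud) -> PickPlace:
    """Toy stand-in for the low-level action model.

    Assumption: grasp the point that has to travel farthest and place it
    at its target location.
    """
    def sq_dist(a: Point, b: Point) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    i = max(range(len(current)), key=lambda j: sq_dist(current[j], target[j]))
    return PickPlace(pick=current[i], place=target[i])


def fold(cloud: PointCloud, instruction: str, steps: int = 3) -> List[PickPlace]:
    """Run the planner, then derive one action per waypoint."""
    actions = []
    current = cloud
    for waypoint in plan_trajectory(cloud, instruction, steps):
        actions.append(predict_action(current, waypoint))
        current = waypoint  # toy assumption: each action is executed perfectly
    return actions


cloud = [(-1.0, 0.0, 0.0), (-0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
actions = fold(cloud, "fold left to right", steps=2)
```

Because the two stages only communicate through point cloud waypoints, either one could in principle be swapped out (for example, a different planner for a new garment category) without touching the other, which is the kind of modularity the abstract attributes to the framework.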

Page Count
8 pages

Category
Computer Science:
Robotics