CLASP: General-Purpose Clothes Manipulation with Semantic Keypoints
Yuhong Deng, Chao Tang, Cunjun Yu, Linfeng Li, David Hsu

TL;DR
CLASP introduces a semantic keypoint-based approach for versatile clothes manipulation, enabling robots to perform various tasks across diverse clothing types by combining vision language models and manipulation skills.
Contribution
It proposes a novel semantic keypoint representation for general-purpose clothes manipulation, bridging high-level planning and low-level execution across multiple tasks and clothing types.
Findings
Outperforms state-of-the-art methods in simulation across diverse clothes and tasks.
Demonstrates successful real-world application with a dual-arm robot.
Shows strong generalization capabilities in complex clothing manipulation tasks.
Abstract
Clothes manipulation, such as folding or hanging, is a critical capability for home service robots. Despite recent advances, most existing methods remain limited to specific clothes types and tasks, due to the complex, high-dimensional geometry of clothes. This paper presents CLothes mAnipulation with Semantic keyPoints (CLASP), which aims at general-purpose clothes manipulation over diverse clothes types (T-shirts, shorts, skirts, long dresses, ...) as well as different tasks (folding, flattening, hanging, ...). The core idea of CLASP is semantic keypoints (e.g., "left sleeve" and "right shoulder"), a sparse spatial-semantic representation that is salient for both perception and action. Semantic keypoints of clothes can be reliably extracted from RGB-D images and provide an effective representation for a wide range of clothes manipulation policies. CLASP uses semantic keypoints as an…
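To make the idea of a sparse spatial-semantic representation concrete, here is a minimal sketch of how labeled keypoints might be stored and lifted from RGB-D pixels to 3D points. It is not the authors' implementation: the abstract does not say how CLASP detects keypoints, so the pixel locations are assumed to be given, and the names `SemanticKeypoint`, `backproject`, and `lift_keypoints` are hypothetical. The only technique used is standard pinhole-camera back-projection with intrinsics `fx`, `fy`, `cx`, `cy`.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class SemanticKeypoint:
    """Hypothetical container: a text label paired with a 3D point."""
    label: str                             # e.g. "left sleeve"
    position: Tuple[float, float, float]   # camera frame, in metres


def backproject(u: int, v: int, depth_m: float,
                fx: float, fy: float, cx: float, cy: float
                ) -> Tuple[float, float, float]:
    """Pinhole back-projection of pixel (u, v) with known depth."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def lift_keypoints(pixels: Dict[str, Tuple[int, int]],
                   depth: Sequence[Sequence[float]],
                   fx: float, fy: float, cx: float, cy: float
                   ) -> Dict[str, SemanticKeypoint]:
    """Turn labeled pixel keypoints into labeled 3D keypoints.

    `pixels` maps a label to (u, v) = (column, row). The detector that
    produces these pixels is assumed, not taken from the paper.
    """
    lifted = {}
    for label, (u, v) in pixels.items():
        z = depth[v][u]          # depth image is indexed [row][column]
        if z <= 0:               # skip pixels with missing depth
            continue
        lifted[label] = SemanticKeypoint(
            label, backproject(u, v, z, fx, fy, cx, cy))
    return lifted
```

A downstream planner or skill could then refer to points by name (for example, grasp at `lifted["left sleeve"].position`), which is one way such a representation could link language-level plans to geometric actions.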
