Leveraging Foundation Models To learn the shape of semi-fluid deformable objects

Omar El Assal (VIBOT; ImViA; Alstom Transport); Carlos M. Mateo (ICB); Sebastien Ciron (Alstom Transport); David Fofi (VIBOT; ImViA)

arXiv:2411.16802 · cs.RO · November 27, 2024

Open Access

TL;DR

This paper explores using foundation models to characterize semi-fluid deformable objects, enabling keypoint detection without extensive labeled datasets, and demonstrates promising results in deformable object manipulation.

Contribution

It introduces a novel approach leveraging foundation models as teachers for deformable object characterization, eliminating the need for pre-training and labeled data.

Findings

01. Student network achieved a 13.4-pixel keypoint error.

02. Teacher model attained 75.26% mIoU in object mask retrieval.

03. Knowledge distillation improved deformable object characterization.
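For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how a mean IoU over binary masks and a mean keypoint error in pixels are commonly computed. It is not the authors' evaluation code: the function names, the nested-list mask representation, the per-image averaging, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
import math


def mask_iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(pred_masks, target_masks):
    """Average IoU over paired masks (per-image averaging is an assumption)."""
    ious = [mask_iou(p, t) for p, t in zip(pred_masks, target_masks)]
    return sum(ious) / len(ious)


def mean_keypoint_error(pred_points, target_points):
    """Average Euclidean distance, in pixels, between paired (x, y) keypoints."""
    dists = [math.dist(p, t) for p, t in zip(pred_points, target_points)]
    return sum(dists) / len(dists)
```

Under this reading, a 13.4 pixel keypoint error would mean that predicted keypoints lie, on average, 13.4 pixels from their reference positions, and 75.26% mIoU would mean that predicted and reference masks overlap on about three quarters of their combined area on average.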

Abstract

One of the difficulties in manipulating deformable objects is their characterization and the detection of representative keypoints for manipulation. Over the last decade, researchers have shown keen interest in characterizing and manipulating non-fluid deformable objects, such as clothes and ropes. Although several approaches to object characterization have been proposed, researchers have consistently needed pixel-level information about the object, obtained from images, to extract relevant features. This is usually accomplished with segmentation networks trained on manually labeled data. In this paper, we address the characterization of the weld pool to define stable features that serve as information for further motion control objectives. We achieve this by employing different pipelines. The…

Taxonomy

Topics: Image Processing and 3D Reconstruction

Methods: Knowledge Distillation