LEMMo-Plan: LLM-Enhanced Learning from Multi-Modal Demonstration for Planning Sequential Contact-Rich Manipulation Tasks
Kejia Chen, Zheng Shen, Yue Zhang, Lingyun Chen, Fan Wu, Zhenshan Bing, Sami Haddadin, Alois Knoll

TL;DR
This paper introduces LEMMo-Plan, a framework that enhances large language models' ability to plan contact-rich manipulation tasks by integrating tactile and force data from demonstrations, leading to more accurate and effective robotic planning.
Contribution
It presents a novel multi-modal in-context learning approach that combines visual, tactile, and force data to improve LLM-based task planning for complex manipulation tasks.
Findings
Improved planning accuracy in real-world manipulation tasks.
Effective integration of tactile and force data with visual demonstrations.
Enhanced LLM understanding of contact-rich interactions.
Abstract
Large Language Models (LLMs) have gained popularity in task planning for long-horizon manipulation tasks. To enhance the validity of LLM-generated plans, visual demonstrations and online videos have been widely employed to guide the planning process. However, for manipulation tasks involving subtle movements but rich contact interactions, visual perception alone may be insufficient for the LLM to fully interpret the demonstration. Additionally, visual data provides limited information on force-related parameters and conditions, which are crucial for effective execution on real robots. In this paper, we introduce an in-context learning framework that incorporates tactile and force-torque information from human demonstrations to enhance LLMs' ability to generate plans for new task scenarios. We propose a bootstrapped reasoning pipeline that sequentially integrates each modality into a…
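The idea of folding one modality at a time into the LLM's context can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`bootstrap_interpretation`, `stub_llm`), the prompt wording, the order of modalities, and the example evidence strings are all hypothetical, and a deterministic stub stands in for a real LLM call so the data flow is visible.

```python
from typing import Callable, List, Tuple


def bootstrap_interpretation(
    modalities: List[Tuple[str, str]],
    task: str,
    llm: Callable[[str], str],
) -> Tuple[str, List[str]]:
    """Refine an interpretation of a demonstration one modality at a time.

    Each stage's prompt carries the interpretation produced by the
    previous stage, so later modalities build on earlier ones.
    Hypothetical sketch; the prompt layout is an assumption.
    """
    interpretation = ""
    prompts: List[str] = []
    for name, summary in modalities:
        prompt = (
            f"Task: {task}\n"
            f"Current interpretation: {interpretation or '(none yet)'}\n"
            f"New {name} evidence: {summary}\n"
            "Update the interpretation of the demonstration."
        )
        prompts.append(prompt)
        interpretation = llm(prompt)
    return interpretation, prompts


def stub_llm(prompt: str) -> str:
    """Stand-in for an LLM: appends the new evidence to the prior text."""
    lines = prompt.splitlines()
    previous = lines[1].split(": ", 1)[1]
    new_evidence = lines[2].split(": ", 1)[1]
    if previous == "(none yet)":
        return new_evidence
    return f"{previous}; {new_evidence}"
```

In use, a caller would pass a list such as `[("visual", ...), ("tactile", ...), ("force-torque", ...)]` with text summaries of each signal and replace `stub_llm` with a real model call; the final interpretation could then serve as in-context material when requesting a plan for a new scenario.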
Taxonomy
Topics: Robot Manipulation and Learning · Software Engineering Research · Teleoperation and Haptic Systems
