Prompt-responsive Object Retrieval with Memory-augmented Student-Teacher Learning

Malte Mosbach; Sven Behnke

arXiv:2505.02232·cs.RO·May 6, 2025

TL;DR

This paper introduces a memory-augmented student-teacher learning framework that combines promptable foundation models with reinforcement learning to enable robots to perform dexterous manipulation tasks based on high-level prompts, even with imperfect perception.

Contribution

It presents a novel integration of promptable perception models with reinforcement learning using memory augmentation for fine-grained control in robotics.

Findings

01. Effective prompt-responsive manipulation in cluttered scenes

02. Successful implicit state estimation from imperfect detections

03. Demonstrated dexterous object picking with high-level prompts

Abstract

Building models responsive to input prompts represents a transformative shift in machine learning. This paradigm holds significant potential for robotics problems, such as targeted manipulation amidst clutter. In this work, we present a novel approach to combine promptable foundation models with reinforcement learning (RL), enabling robots to perform dexterous manipulation tasks in a prompt-responsive manner. Existing methods struggle to link high-level commands with fine-grained dexterous control. We address this gap with a memory-augmented student-teacher learning framework. We use the Segment-Anything 2 (SAM 2) model as a perception backbone to infer an object of interest from user prompts. While detections are imperfect, their temporal sequence provides rich information for implicit state estimation by memory-augmented models. Our approach successfully learns prompt-responsive…
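
The abstract's claim that a temporal sequence of imperfect detections carries more information than any single frame can be illustrated with a toy sketch. This is not the paper's method: the paper uses learned memory-augmented models on SAM 2 detections, whereas the sketch below substitutes a hand-written running mean for the learned memory. It also assumes a static object, Gaussian detection noise, and randomly dropped frames. All function names and parameter values are hypothetical.

```python
import numpy as np

def noisy_detections(true_pos, steps, noise_std, drop_prob, rng):
    """Simulate imperfect per-frame detections of a static object (assumed setup).

    Returns (detections, valid); dropped frames are marked invalid.
    """
    detections = true_pos + rng.normal(0.0, noise_std, size=(steps, true_pos.size))
    valid = rng.random(steps) >= drop_prob
    valid[0] = True  # guarantee at least one usable frame
    return detections, valid

def memory_estimate(detections, valid):
    """Running mean over valid detections: a stand-in for learned memory."""
    estimate = np.zeros(detections.shape[1])
    count = 0
    for det, ok in zip(detections, valid):
        if ok:
            count += 1
            estimate += (det - estimate) / count  # incremental mean update
    return estimate

def last_frame_estimate(detections, valid):
    """Memoryless baseline: use only the most recent valid detection."""
    return detections[np.flatnonzero(valid)[-1]]

def compare(trials=200, steps=50, noise_std=0.05, drop_prob=0.3, seed=0):
    """Average position error of both estimators over many random trials."""
    rng = np.random.default_rng(seed)
    err_memory, err_single = [], []
    for _ in range(trials):
        true_pos = rng.uniform(-0.5, 0.5, size=3)
        dets, valid = noisy_detections(true_pos, steps, noise_std, drop_prob, rng)
        err_memory.append(np.linalg.norm(memory_estimate(dets, valid) - true_pos))
        err_single.append(np.linalg.norm(last_frame_estimate(dets, valid) - true_pos))
    return float(np.mean(err_memory)), float(np.mean(err_single))

if __name__ == "__main__":
    mem_err, single_err = compare()
    print(f"memory: {mem_err:.4f}  single frame: {single_err:.4f}")
```

Under these assumptions, the estimator that aggregates over time has a lower average error than the single-frame baseline. A learned recurrent policy could in principle go further, for example by handling moving objects or systematic detection errors, which a plain average cannot.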

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Robotics and Automated Systems