Supervised Mixture-of-Experts for Surgical Grasping and Retraction

Lorenzo Mazza; Ariel Rodriguez; Rayan Younis; Martin Lelis; Ortrun Hellig; Chenpan Li; Sebastian Bodenstedt; Martin Wagner; Stefanie Speidel

arXiv:2601.21971 · cs.RO · May 12, 2026

TL;DR

This paper introduces a supervised Mixture-of-Experts architecture for surgical manipulation tasks, enabling effective learning from limited data and demonstrating robust generalization and transfer capabilities in robotic surgery scenarios.

Contribution

The paper presents a novel supervised MoE architecture that enhances surgical robot learning, allowing complex manipulation from fewer demonstrations and improving robustness and transferability.

Findings

01 Supervised MoE significantly improves success rates over standard models.
02 The approach generalizes to unseen viewpoints and transfers zero-shot to ex vivo tissue.
03 It demonstrates promising preliminary results in in vivo porcine surgery.

Abstract

Imitation learning has achieved remarkable success in robotic manipulation, yet its application to surgical robotics remains challenging due to data scarcity, constrained workspaces, and the need for an exceptional level of safety and predictability. We present a supervised Mixture-of-Experts (MoE) architecture designed for phase-structured surgical manipulation tasks, which can be added on top of any autonomous policy. Unlike prior surgical robot learning approaches that rely on multi-camera setups or thousands of demonstrations, we show that a lightweight action decoder policy such as the Action Chunking Transformer (ACT), when equipped with our architecture, can learn complex, long-horizon manipulation from fewer than 150 demonstrations using only stereo endoscopic images. We evaluate our approach on the collaborative surgical task of bowel grasping and retraction, where a robot assistant…
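The abstract does not spell out how the supervised MoE is built, so the following is only a minimal sketch of one way such a head could sit on top of a policy's features. It assumes that each training frame carries a surgical phase label, that the gate is trained with a cross-entropy loss against that label, and that there is one expert per phase. The class name `SupervisedMoEHead`, the linear gate and experts, the L1 imitation loss, and all dimensions are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class SupervisedMoEHead:
    """Toy supervised MoE head: a linear gate plus one linear expert per phase.

    Assumption: `feats` are features produced by some upstream policy
    backbone; this head only illustrates the gating and the loss.
    """

    def __init__(self, feat_dim, action_dim, num_phases, seed=0):
        rng = np.random.default_rng(seed)
        self.W_gate = rng.normal(0.0, 0.1, (feat_dim, num_phases))
        self.W_experts = rng.normal(0.0, 0.1, (num_phases, feat_dim, action_dim))

    def forward(self, feats):
        # Gate probabilities over phases, shape (batch, num_phases).
        probs = softmax(feats @ self.W_gate)
        # Every expert proposes an action, shape (batch, num_phases, action_dim).
        expert_out = np.einsum("bf,kfa->bka", feats, self.W_experts)
        # Mix expert proposals with the gate probabilities.
        actions = np.einsum("bk,bka->ba", probs, expert_out)
        return actions, probs

    def loss(self, feats, target_actions, phase_labels, gate_weight=1.0):
        actions, probs = self.forward(feats)
        # Imitation term: L1 distance to the demonstrated actions.
        imitation = np.abs(actions - target_actions).mean()
        # Supervision term: cross-entropy between gate output and phase labels.
        picked = probs[np.arange(len(phase_labels)), phase_labels]
        gate_ce = -np.log(picked + 1e-12).mean()
        return imitation + gate_weight * gate_ce, imitation, gate_ce


# Usage with random stand-in data (4 frames, 8 features, 3-D actions, 2 phases).
head = SupervisedMoEHead(feat_dim=8, action_dim=3, num_phases=2)
feats = np.random.default_rng(1).normal(size=(4, 8))
actions, probs = head.forward(feats)
```

The point of the explicit gate loss in this sketch is that routing is tied to labeled phases rather than left to emerge on its own, which is one plausible reading of "supervised" in a phase-structured task. A real system would replace the linear maps with the policy's own decoder and could route with a hard argmax at inference time.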

Code & Models

Datasets

nct-tso/robotics_bowel_grasping
dataset · 81 dl
