Unsupervised Learning for Surgical Motion by Learning to Predict the Future

Robert DiPietro; Gregory D. Hager

arXiv:1806.03318 · cs.CV · June 12, 2018

TL;DR

This paper introduces an unsupervised approach that combines an RNN encoder-decoder with mixture density networks (MDNs) to predict future surgical motion from past motion. The learned encodings cluster by high-level activity and improve motion-based activity retrieval.

Contribution

It presents an unsupervised method for learning surgical motion representations through future prediction. On motion-based retrieval of suturing activity, the method outperforms simpler baselines and the best previously published result, as measured by F1 score.

Findings

01

Learned encodings cluster by high-level activities.

02

Future prediction with MDNs significantly outperforms simpler baselines and the best previously published result.

03

Activity retrieval F1 score improves from 0.60 ± 0.14 (previous best) to 0.77 ± 0.05.
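
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for binary retrieval labels. It is illustrative only: the function name `f1_score` and the frame-level labeling are assumptions, since this summary does not state how the paper aggregates its scores.

```python
def f1_score(predicted, actual):
    """F1 for binary labels (1 = retrieved / relevant, 0 = not).

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    pairs = [(bool(p), bool(a)) for p, a in zip(predicted, actual)]
    tp = sum(1 for p, a in pairs if p and a)        # true positives
    fp = sum(1 for p, a in pairs if p and not a)    # false positives
    fn = sum(1 for p, a in pairs if not p and a)    # false negatives
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up labels: 2 hits, 1 false alarm, 1 miss.
score = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])  # 2/3
```

Because F1 penalizes both false alarms and misses, it suits retrieval tasks where the queried activity occupies only part of the database.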

Abstract

We show that it is possible to learn meaningful representations of surgical motion, without supervision, by learning to predict the future. An architecture that combines an RNN encoder-decoder and mixture density networks (MDNs) is developed to model the conditional distribution over future motion given past motion. We show that the learned encodings naturally cluster according to high-level activities, and we demonstrate the usefulness of these learned encodings in the context of information retrieval, where a database of surgical motion is searched for suturing activity using a motion-based query. Future prediction with MDNs is found to significantly outperform simpler baselines as well as the best previously published result for this task, advancing state-of-the-art performance from an F1 score of 0.60 ± 0.14 to 0.77 ± 0.05.
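
As a rough illustration of what modeling a conditional distribution with an MDN involves, the sketch below evaluates the negative log-likelihood of one future motion frame under a mixture of diagonal Gaussians. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mdn_nll`, the diagonal covariance, and the parameter shapes are all hypothetical, and in practice the mixture parameters would be produced by the decoder network rather than passed in by hand.

```python
import numpy as np


def mdn_nll(pi_logits, mu, log_sigma, target):
    """Negative log-likelihood of `target` under a diagonal-Gaussian mixture.

    pi_logits: (K,)   unnormalized mixture weights (assumed network output)
    mu:        (K, D) component means
    log_sigma: (K, D) log standard deviations
    target:    (D,)   observed future motion frame
    """
    # Log-softmax turns the logits into log mixture weights.
    log_pi = pi_logits - np.logaddexp.reduce(pi_logits)
    # Per-component Gaussian log-density with diagonal covariance.
    z = (target - mu) / np.exp(log_sigma)
    dim = mu.shape[1]
    log_comp = (
        -0.5 * np.sum(z ** 2, axis=1)
        - np.sum(log_sigma, axis=1)
        - 0.5 * dim * np.log(2.0 * np.pi)
    )
    # Log-sum-exp over components gives the mixture log-likelihood.
    return -np.logaddexp.reduce(log_pi + log_comp)


# Toy example with made-up values: K=2 components, D=3 motion dimensions.
rng = np.random.default_rng(0)
loss = mdn_nll(
    pi_logits=np.zeros(2),
    mu=rng.normal(size=(2, 3)),
    log_sigma=np.zeros((2, 3)),
    target=np.zeros(3),
)
```

Minimizing a loss of this form over many (past, future) pairs lets the model represent several plausible futures at once, which a single Gaussian or a squared-error loss cannot do.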
