Unsupervised Learning for Surgical Motion by Learning to Predict the Future
Robert DiPietro, Gregory D. Hager

TL;DR
This paper introduces an unsupervised learning approach that combines an RNN encoder-decoder with mixture density networks (MDNs) to predict future surgical motion, yielding meaningful motion representations and improved activity retrieval.
Contribution
It presents a novel unsupervised method for learning surgical motion representations through future prediction, and it outperforms the best previously published result on activity retrieval as measured by F1 score.
Findings
Learned encodings cluster by high-level activities.
Future prediction with MDNs significantly outperforms simpler baselines and the best previously published result.
The state-of-the-art F1 score for suturing activity retrieval improves from 0.60 ± 0.14 to 0.77 ± 0.05.
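To make the retrieval finding concrete, here is a minimal sketch of how a motion-based query could be scored with F1. It assumes each database segment has already been encoded as a fixed-length vector, and it retrieves the `k` nearest encodings by Euclidean distance. The function names, the distance measure, and the toy data are illustrative assumptions, not the paper's evaluation protocol.

```python
import numpy as np

def retrieve(encodings, query, k):
    """Return indices of the k encodings closest to the query (Euclidean).

    Hypothetical helper: the paper's actual retrieval rule may differ.
    """
    distances = np.linalg.norm(encodings - query, axis=1)
    return [int(i) for i in np.argsort(distances)[:k]]

def f1_score(retrieved, relevant):
    """F1 = harmonic mean of precision and recall over retrieved items."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(retrieved)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)

# Toy database: two tight clusters standing in for two activities.
encodings = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
query = np.array([0.0, 0.0])
relevant = [0, 1, 2]  # made-up ground-truth labels for illustration

retrieved = retrieve(encodings, query, k=2)
score = f1_score(retrieved, relevant)
```

In this toy case both retrieved items are relevant, but one relevant item is missed, so precision is 1 and recall is 2/3.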
Abstract
We show that it is possible to learn meaningful representations of surgical motion, without supervision, by learning to predict the future. An architecture that combines an RNN encoder-decoder and mixture density networks (MDNs) is developed to model the conditional distribution over future motion given past motion. We show that the learned encodings naturally cluster according to high-level activities, and we demonstrate the usefulness of these learned encodings in the context of information retrieval, where a database of surgical motion is searched for suturing activity using a motion-based query. Future prediction with MDNs is found to significantly outperform simpler baselines as well as the best previously-published result for this task, advancing state-of-the-art performance from an F1 score of 0.60 ± 0.14 to 0.77 ± 0.05.
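As an illustration of what modeling the conditional distribution over future motion with an MDN involves, the sketch below computes the negative log-likelihood of one target vector under a Gaussian mixture. It assumes diagonal covariances and takes the mixture parameters as plain arrays; in the paper's setting these would come from the decoder, which is not reproduced here. The function name and parameterization (logits and log standard deviations) are assumptions made for this example.

```python
import numpy as np

def mdn_nll(y, logits, mu, log_sigma):
    """Negative log-likelihood of y under a diagonal-Gaussian mixture.

    y:         (D,)   target vector (e.g. future motion at one time step)
    logits:    (K,)   unnormalized mixture weights
    mu:        (K, D) component means
    log_sigma: (K, D) component log standard deviations
    """
    # Log-softmax turns logits into log mixture weights.
    log_weights = logits - np.logaddexp.reduce(logits)
    # Diagonal Gaussian log-density per component, summed over dimensions.
    z = (y - mu) / np.exp(log_sigma)
    log_components = -0.5 * np.sum(
        z ** 2 + 2.0 * log_sigma + np.log(2.0 * np.pi), axis=1
    )
    # Log-sum-exp over components gives the mixture log-likelihood.
    return -np.logaddexp.reduce(log_weights + log_components)

# Single standard-normal component in 2-D, evaluated at the mean.
nll = mdn_nll(np.zeros(2), np.zeros(1), np.zeros((1, 2)), np.zeros((1, 2)))
```

Working in log space with log-sum-exp avoids underflow when component densities are very small. A training loop would minimize this quantity averaged over future time steps.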
