Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video
Guiqiu Liao, Matjaz Jogan, Sai Koushik, Eric Eaton, Daniel A. Hashimoto

TL;DR
This paper presents VDST-Net, a novel framework that disentangles spatiotemporal information in surgical videos to improve weakly supervised object detection and segmentation, especially when objects are transient and sparsely present.
Contribution
Introduction of VDST-Net, a semi-decoupled knowledge distillation approach that enhances segmentation accuracy in weakly supervised surgical videos by resolving temporal conflicts.
Findings
Outperforms state-of-the-art methods on public datasets.
Generates higher quality segmentation masks with limited frame annotations.
Effective in challenging surgical scenarios with sparse object presence.
Abstract
Weakly supervised video object segmentation (WSVOS) enables the identification of segmentation maps without requiring an extensive training dataset of object masks, relying instead on coarse video labels indicating object presence. Current state-of-the-art methods either require multiple independent processing stages that employ motion cues or, in the case of end-to-end trainable networks, fall short in segmentation accuracy, in part because segmentation maps are difficult to learn from videos with transient object presence. This limits the application of WSVOS to semantic annotation of surgical videos, where multiple surgical tools frequently move in and out of the field of view, a harder problem than is typically encountered in WSVOS. This paper introduces Video Spatio-Temporal Disentanglement Networks (VDST-Net), a framework to disentangle spatiotemporal information…
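To make the weak-supervision setting concrete, the following is a minimal sketch of a generic video-level training signal. It is not VDST-Net or any method from the paper. It assumes a network has already produced per-frame, per-pixel logits for one tool class; the function names and toy values are hypothetical. The logits are max-pooled over space and time into a single presence score, which is compared against a 0/1 video label with binary cross-entropy.

```python
import math

def video_presence_logit(activation_maps):
    """Max-pool per-pixel logits over space and time into one video-level logit.

    activation_maps: list of frames, each a list of rows of float logits.
    (Hypothetical helper for illustration, not from the paper.)
    """
    return max(v for frame in activation_maps for row in frame for v in row)

def bce_with_logit(logit, label):
    """Binary cross-entropy between sigmoid(logit) and a 0/1 video-level label."""
    # Numerically stable form: max(x, 0) - x*y + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# Toy video: 2 frames of 2x2 logits for one tool class (made-up values).
video = [
    [[-3.0, -2.5], [-4.0, -3.5]],  # frame 0: tool absent
    [[-2.0, 4.0], [-1.0, 0.5]],    # frame 1: tool enters the view
]

logit = video_presence_logit(video)
loss_present = bce_with_logit(logit, 1)  # video label says "tool present"
loss_absent = bce_with_logit(logit, 0)   # video label says "tool absent"
```

In this sketch only the single strongest location contributes to the video-level score, so the label says nothing about which frames or pixels contain the tool. That gap is one way to see why transient object presence makes masks hard to learn from video labels alone.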
Taxonomy
Topics: Digital Media Forensic Detection · Advanced Image Processing Techniques · AI in cancer detection
Methods: Knowledge Distillation
