Surgical Triplet Recognition via Diffusion Model
Daochang Liu, Axel Hu, Mubarak Shah, Chang Xu

TL;DR
This paper introduces DiffTriplet, a diffusion model-based framework for surgical triplet recognition that improves accuracy by jointly modeling triplets and components with association-guided denoising.
Contribution
The paper presents a novel diffusion model approach with association learning and guidance for surgical triplet recognition, achieving state-of-the-art results.
Findings
Outperforms existing methods on the CholecT45 and CholecT50 datasets, achieving higher triplet recognition accuracy
Demonstrates effectiveness of association-guided denoising
Abstract
Surgical triplet recognition is an essential building block to enable next-generation context-aware operating rooms. The goal is to identify the combinations of instruments, verbs, and targets presented in surgical video frames. In this paper, we propose DiffTriplet, a new generative framework for surgical triplet recognition employing the diffusion model, which predicts surgical triplets via iterative denoising. To handle the challenge of triplet association, two unique designs are proposed in our diffusion framework, i.e., association learning and association guidance. During training, we optimize the model in the joint space of triplets and individual components to capture the dependencies among them. At inference, we integrate association constraints into each update of the iterative denoising process, which refines the triplet prediction using the information of individual…
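The idea of association guidance at inference can be illustrated with a toy sketch. This is not the authors' implementation: the four-entry triplet vocabulary, the function names (association_matrix, guide, toy_denoise), the blending weight, and the use of a simple clip as a stand-in for the learned denoiser are all assumptions made for illustration. The sketch only shows how component predictions (instrument, verb, target) could be used to nudge a triplet estimate at every denoising update.

```python
import numpy as np

# Hypothetical toy vocabulary: each row is one triplet class,
# given as (instrument id, verb id, target id). Not from the paper.
TRIPLETS = np.array([
    [0, 0, 0],
    [0, 1, 1],
    [1, 2, 0],
    [1, 2, 1],
])
N_INSTR, N_VERB, N_TARGET = 2, 3, 2


def association_matrix(triplets, n_classes, column):
    """Binary map A[t, c] = 1 if triplet t contains component class c."""
    A = np.zeros((len(triplets), n_classes))
    A[np.arange(len(triplets)), triplets[:, column]] = 1.0
    return A


def guide(triplet_est, instr_p, verb_p, target_p, weight=0.5):
    """Blend the triplet estimate with the score implied by its components."""
    a_i = association_matrix(TRIPLETS, N_INSTR, 0)
    a_v = association_matrix(TRIPLETS, N_VERB, 1)
    a_t = association_matrix(TRIPLETS, N_TARGET, 2)
    # A triplet is supported only if all three of its components are supported.
    implied = (a_i @ instr_p) * (a_v @ verb_p) * (a_t @ target_p)
    return (1.0 - weight) * triplet_est + weight * implied


def toy_denoise(noisy, components, steps=5, weight=0.5):
    """Iterative refinement with guidance applied at every update."""
    x = noisy.copy()
    for _ in range(steps):
        x = np.clip(x, 0.0, 1.0)  # assumed stand-in for a learned denoiser
        x = guide(x, *components, weight=weight)
    return x
```

For example, calling toy_denoise with a uniform initial estimate and component predictions that favor instrument 0, verb 0, and target 0 pulls the estimate toward the first triplet class and suppresses the others. In the actual framework the components are modeled jointly with the triplets during training rather than supplied as fixed inputs, so this sketch covers only the inference-time intuition.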
Taxonomy
Topics: Medical Image Segmentation Techniques · Medical Imaging and Analysis · Radiomics and Machine Learning in Medical Imaging
Methods: Diffusion
