TL;DR
SurGrID is a Scene Graph to Image Diffusion Model for controllable, high-fidelity surgical scene synthesis. It improves image fidelity and coherence with the graph input over existing conditioning methods.
Contribution
The paper presents SurGrID, a method that conditions a diffusion model on scene graphs to give precise, interactive control over generated surgical images. A novel pre-training step translates each graph into an intermediate representation that explicitly captures local and global information.
Findings
Higher image fidelity and better coherence with the input graph than state-of-the-art conditioning methods.
Scene graphs provide interactive, precise control over the generated surgical scene.
Positive results in a user assessment with clinical experts.
Abstract
Surgical simulation offers a promising addition to conventional surgical training. However, available simulation tools lack photorealism and rely on hardcoded behaviour. Denoising Diffusion Models are a promising alternative for high-fidelity image synthesis, but existing state-of-the-art conditioning methods fall short in providing precise control or interactivity over the generated scenes. We introduce SurGrID, a Scene Graph to Image Diffusion Model, allowing for controllable surgical scene synthesis by leveraging Scene Graphs. These graphs encode a surgical scene's components' spatial and semantic information, which are then translated into an intermediate representation using our novel pre-training step that explicitly captures local and global information. Our proposed method improves the fidelity of generated images and their coherence with the graph input over the…
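To make the idea of a graph that encodes spatial and semantic information concrete, here is a minimal sketch of such a data structure. It is not the authors' implementation: the class names `Node` and `SceneGraph`, the normalized centre coordinates, the undirected edges, and the labels used in the example are all assumptions chosen for illustration. The diffusion model and the pre-training step are not sketched.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One scene component (hypothetical layout, not taken from the paper)."""
    label: str  # semantic information: a class name
    x: float    # spatial information: centre x, normalized to [0, 1]
    y: float    # centre y, normalized to [0, 1]


@dataclass
class SceneGraph:
    """Nodes plus undirected edges stored as (smaller index, larger index)."""
    nodes: list[Node] = field(default_factory=list)
    edges: set[tuple[int, int]] = field(default_factory=set)

    def add_node(self, label: str, x: float, y: float) -> int:
        """Append a node and return its index."""
        self.nodes.append(Node(label, x, y))
        return len(self.nodes) - 1

    def connect(self, a: int, b: int) -> None:
        """Link two distinct nodes, e.g. components that touch in the image."""
        if a == b:
            raise ValueError("self-loops are not allowed in this sketch")
        self.edges.add((min(a, b), max(a, b)))

    def move(self, idx: int, x: float, y: float) -> None:
        """Edit a node's position; the label and the edges stay unchanged."""
        self.nodes[idx] = Node(self.nodes[idx].label, x, y)

    def neighbours(self, idx: int) -> list[str]:
        """Labels of all nodes that share an edge with node idx, by index order."""
        linked = sorted(b if a == idx else a for a, b in self.edges if idx in (a, b))
        return [self.nodes[i].label for i in linked]


# Example with hypothetical labels: build a two-node scene, then move one node.
graph = SceneGraph()
tool = graph.add_node("instrument", 0.2, 0.3)
tissue = graph.add_node("tissue", 0.5, 0.5)
graph.connect(tool, tissue)
graph.move(tool, 0.6, 0.4)  # an interactive edit to the conditioning input
```

In this sketch, interactivity amounts to editing the graph (moving, adding, or removing nodes) and passing the edited graph to the generator again.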
Taxonomy
Methods: Diffusion
