SurgSora: Object-Aware Diffusion Model for Controllable Surgical Video Generation
Tong Chen, Shuya Yang, Junyi Wang, Long Bai, Hongliang Ren, Luping Zhou

TL;DR
SurgSora is a diffusion-based framework that generates realistic, controllable surgical videos from a single image, using object-aware features and user-guided motion control, with the aim of improving medical training tools.
Contribution
The paper introduces SurgSora, a new object-aware diffusion model that enables fine-grained, user-controllable surgical video synthesis with enhanced realism and motion accuracy.
Findings
Achieves state-of-the-art visual authenticity in surgical video generation.
Enables precise user-guided object motion control.
Demonstrates high realism through expert surgeon evaluations.
Abstract
Surgical video generation can enhance medical education and research, but existing methods lack fine-grained motion control and realism. We introduce SurgSora, a framework that generates high-fidelity, motion-controllable surgical videos from a single input frame and user-specified motion cues. Unlike prior approaches that treat objects indiscriminately or rely on ground-truth segmentation masks, SurgSora leverages self-predicted object features and depth information to refine RGB appearance and optical flow for precise video synthesis. It consists of three key modules: (1) the Dual Semantic Injector, which extracts object-specific RGB-D features and segmentation cues to enhance spatial representations; (2) the Decoupled Flow Mapper, which fuses multi-scale optical flow with semantic features for realistic motion dynamics; and (3) the Trajectory Controller, which estimates sparse…
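To make "user-specified motion cues" concrete, here is a minimal sketch of one way such cues could be encoded: each cue is a start and end pixel, written into an otherwise empty flow map. The abstract above is truncated and does not specify the encoding, so the function name `sparse_flow_from_cues`, the cue format, and the mask output are all hypothetical illustrations, not SurgSora's actual interface.

```python
# Hypothetical illustration only: not the authors' implementation.
# Assumption: a motion cue is a pair of pixel coordinates ((x0, y0), (x1, y1))
# meaning "the content at (x0, y0) should move to (x1, y1)".

def sparse_flow_from_cues(height, width, cues):
    """Rasterize sparse motion cues into a flow map and a validity mask.

    Returns:
        flow: height x width grid of (dx, dy) displacements, zero where no cue exists.
        mask: height x width grid of booleans, True where a cue was written.
    """
    flow = [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
    mask = [[False for _ in range(width)] for _ in range(height)]
    for (x0, y0), (x1, y1) in cues:
        # Ignore cues whose start point lies outside the frame.
        if 0 <= x0 < width and 0 <= y0 < height:
            flow[y0][x0] = (float(x1 - x0), float(y1 - y0))
            mask[y0][x0] = True
    return flow, mask


# One cue: drag the pixel at (x=1, y=2) to (x=4, y=3) in a 4x6 frame.
flow, mask = sparse_flow_from_cues(4, 6, [((1, 2), (4, 3))])
```

A map like this is sparse by construction: only the cued pixels carry a displacement, and the mask lets a downstream model distinguish "no motion" from "no cue given".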
Taxonomy
Topics: Computer Graphics and Visualization Techniques
Methods: Diffusion
