Endora: Video Generation Models as Endoscopy Simulators

Chenxin Li; Hengyu Liu; Yifan Liu; Brandon Y. Feng; Wuyang Li; Xinyu Liu; Zhen Chen; Jing Shao; Yixuan Yuan

arXiv:2403.11050·cs.CV·March 19, 2024·1 cites

TL;DR

Endora introduces a novel generative model for creating realistic clinical endoscopy videos, combining a spatial-temporal transformer with vision priors, and establishes a new benchmark for endoscopy simulation.

Contribution

The paper presents the first endoscopy video generation model integrating a spatial-temporal transformer and vision priors, along with a public benchmark for evaluation.

Findings

01 Outperforms existing methods in visual quality

02 Enables downstream video analysis tasks

03 Supports 3D scene generation with multi-view consistency

Abstract

Generative models hold promise for revolutionizing medical education, robot-assisted surgery, and data augmentation for machine learning. Despite progress in generating 2D medical images, the complex domain of clinical video generation has largely remained untapped. This paper introduces Endora, an innovative approach to generate medical videos that simulate clinical endoscopy scenes. We present a novel generative model design that integrates a meticulously crafted spatial-temporal video transformer with advanced 2D vision foundation model priors, explicitly modeling spatial-temporal dynamics during video generation. We also pioneer the first public benchmark for endoscopy simulation with video generation models, adapting existing state-of-the-art methods for this endeavor. Endora demonstrates exceptional visual quality in generating endoscopy videos, surpassing state-of-the-art methods…

Taxonomy

Topics: Augmented Reality Applications · Surgical Simulation and Training