BronchOpt: Vision-Based Pose Optimization with Fine-Tuned Foundation Models for Accurate Bronchoscopy Navigation
Hongchao Shu, Roger D. Soberanis-Mukul, Jiru Xu, Hao Ding, Morgan Ringel, Mali Shen, Saif Iftekar Sayed, Hedyeh Rafii-Tari, Mathias Unberath

TL;DR
This paper introduces BronchOpt, a vision-based framework utilizing fine-tuned foundation models for accurate, domain-invariant bronchoscopy navigation, supported by a new synthetic dataset for standardized evaluation.
Contribution
It presents a novel pose optimization method with a domain-invariant encoder and a synthetic benchmark dataset for bronchoscopy navigation.
Findings
Achieves 2.65 mm translational error and 0.19 rad rotational error on synthetic data.
Demonstrates strong cross-domain generalization on real patient data.
Provides a publicly available synthetic dataset for benchmarking.
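The translational (mm) and rotational (rad) errors reported above are pose-error metrics. This summary does not give the paper's exact definitions, so the following is only a minimal sketch under a common convention: Euclidean distance between camera positions, and the geodesic angle between rotation matrices. The function names `translation_error`, `rotation_error`, and `rot_z` are hypothetical and chosen for illustration.

```python
import numpy as np

def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and reference positions (same units as input, e.g. mm)."""
    t_est = np.asarray(t_est, dtype=float)
    t_gt = np.asarray(t_gt, dtype=float)
    return float(np.linalg.norm(t_est - t_gt))

def rotation_error(R_est, R_gt):
    """Geodesic angle in radians between two 3x3 rotation matrices.

    Assumed convention (not confirmed by the source): angle of the relative
    rotation R_est^T R_gt, recovered from its trace.
    """
    R_rel = np.asarray(R_est, dtype=float).T @ np.asarray(R_gt, dtype=float)
    cos_angle = (np.trace(R_rel) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from round-off.
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def rot_z(angle):
    """Rotation by `angle` radians about the z axis (helper for the example)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Illustrative values only; these are not data from the paper.
t_err = translation_error([3.0, 4.0, 0.0], [0.0, 0.0, 0.0])
r_err = rotation_error(rot_z(0.19), np.eye(3))
```

With these illustrative inputs, `t_err` is the length of a 3-4-5 triangle's hypotenuse and `r_err` recovers the angle passed to `rot_z`.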
Abstract
Accurate intra-operative localization of the bronchoscope tip relative to patient anatomy remains challenging due to respiratory motion, anatomical variability, and CT-to-body divergence that cause deformation and misalignment between intra-operative views and pre-operative CT. Existing vision-based methods often fail to generalize across domains and patients, leading to residual alignment errors. This work establishes a generalizable foundation for bronchoscopy navigation through a robust vision-based framework and a new synthetic benchmark dataset that enables standardized and reproducible evaluation. We propose a vision-based pose optimization framework for frame-wise 2D-3D registration between intra-operative endoscopic views and pre-operative CT anatomy. A fine-tuned modality- and domain-invariant encoder enables direct similarity computation between real endoscopic RGB frames and…
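The abstract describes pose optimization driven by a similarity score computed from encoder features. The toy sketch below illustrates that general idea only; it is not the authors' method. It assumes a hypothetical callable `embed_view(pose)` that stands in for "produce a view of the pre-operative anatomy at a candidate pose and encode it", uses cosine similarity as an assumed score, and swaps in a simple greedy coordinate search as the optimizer. The random `toy_embed` function replaces any real renderer or encoder.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def refine_pose(embed_view, target_embedding, init_pose, step=1.0, iters=50):
    """Greedy coordinate search that only accepts moves which raise the similarity.

    embed_view: hypothetical callable mapping a 6-vector pose to a feature vector.
    The single shared step size for all pose components is a simplification.
    """
    pose = np.asarray(init_pose, dtype=float).copy()
    best = cosine_similarity(embed_view(pose), target_embedding)
    for _ in range(iters):
        improved = False
        for i in range(pose.size):
            for delta in (step, -step):
                candidate = pose.copy()
                candidate[i] += delta
                score = cosine_similarity(embed_view(candidate), target_embedding)
                if score > best:
                    pose, best, improved = candidate, score, True
        if not improved:
            step *= 0.5  # shrink the search radius once no neighbor improves
    return pose, best

# Toy stand-in for "render at pose, then encode"; purely illustrative.
rng = np.random.default_rng(0)
W = rng.normal(size=(32, 6))
b = rng.normal(size=32)

def toy_embed(pose):
    return np.tanh(0.1 * (W @ np.asarray(pose, dtype=float)) + b)

true_pose = np.array([2.0, -1.0, 3.0, 0.1, -0.2, 0.05])
init_pose = true_pose + np.array([1.5, -1.0, 0.5, 0.3, 0.2, -0.1])
target = toy_embed(true_pose)  # plays the role of the observed frame's features

init_sim = cosine_similarity(toy_embed(init_pose), target)
refined_pose, final_sim = refine_pose(toy_embed, target, init_pose)
```

Because a candidate is accepted only when its score is strictly higher, the similarity never decreases during refinement. A practical system would likely use a gradient-based or more sample-efficient optimizer and a proper pose parameterization; those choices are outside what this summary states.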
Taxonomy
Topics: Soft Robotics and Applications · Surgical Simulation and Training · Augmented Reality Applications
