BronchOpt: Vision-Based Pose Optimization with Fine-Tuned Foundation Models for Accurate Bronchoscopy Navigation

Hongchao Shu; Roger D. Soberanis-Mukul; Jiru Xu; Hao Ding; Morgan Ringel; Mali Shen; Saif Iftekar Sayed; Hedyeh Rafii-Tari; Mathias Unberath

arXiv:2511.09443·cs.CV·November 13, 2025

TL;DR

This paper introduces BronchOpt, a vision-based framework that uses fine-tuned foundation models for accurate, domain-invariant bronchoscopy navigation, and a new synthetic dataset for standardized evaluation.

Contribution

The paper presents a pose optimization method built on a domain-invariant encoder, together with a synthetic benchmark dataset for bronchoscopy navigation.

Findings

1. Achieves 2.65 mm translational error and 0.19 rad rotational error on synthetic data.

2. Demonstrates strong cross-domain generalization on real patient data.

3. Provides a publicly available synthetic dataset for benchmarking.
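
For readers unfamiliar with how pose errors such as those in the first finding are typically measured, the sketch below shows one common convention: Euclidean distance between positions for the translational error, and the geodesic angle between rotation matrices for the rotational error. This page does not state the paper's exact metric definitions, so the formulas and the helper names (`translation_error`, `rotation_error`, `rot_z`) are assumptions for illustration only.

```python
import math


def translation_error(t_est, t_gt):
    """Euclidean distance between two 3-D positions (same unit as the inputs, e.g. mm)."""
    return math.dist(t_est, t_gt)


def rotation_error(R_est, R_gt):
    """Geodesic angle in radians between two 3x3 rotation matrices (nested lists)."""
    # trace(R_est^T @ R_gt) equals the sum of element-wise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)


def rot_z(theta):
    """Rotation by theta radians about the z-axis (helper for the example only)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    print(translation_error((1.0, 2.0, 3.0), (1.0, 2.0, 5.0)))
    print(rotation_error(rot_z(0.0), rot_z(0.19)))
```

Under this convention, a rotational error of 0.19 rad corresponds to roughly 10.9 degrees.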

Abstract

Accurate intra-operative localization of the bronchoscope tip relative to patient anatomy remains challenging due to respiratory motion, anatomical variability, and CT-to-body divergence that cause deformation and misalignment between intra-operative views and pre-operative CT. Existing vision-based methods often fail to generalize across domains and patients, leading to residual alignment errors. This work establishes a generalizable foundation for bronchoscopy navigation through a robust vision-based framework and a new synthetic benchmark dataset that enables standardized and reproducible evaluation. We propose a vision-based pose optimization framework for frame-wise 2D-3D registration between intra-operative endoscopic views and pre-operative CT anatomy. A fine-tuned modality- and domain-invariant encoder enables direct similarity computation between real endoscopic RGB frames and…
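
The general idea of similarity-driven pose optimization can be sketched as follows: embed the observed frame once, then adjust a candidate camera pose so that the embedding of the view predicted at that pose becomes as similar as possible to the frame embedding. The abstract is truncated on this page, so the sketch makes several assumptions that are not taken from the paper: the similarity is cosine similarity, the optimizer is a simple derivative-free coordinate search chosen for brevity, and `toy_embed_view` is a hypothetical placeholder for "render the CT anatomy at a pose, then encode it".

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def toy_embed_view(pose):
    """Hypothetical stand-in for rendering at `pose` and encoding the result.

    It maps each pose parameter to (cos, sin), so similarity peaks when the
    candidate pose matches the pose that produced the frame embedding.
    """
    return [f(p) for p in pose for f in (math.cos, math.sin)]


def refine_pose(frame_embedding, init_pose, embed_view, step=0.5, min_step=1e-4):
    """Coordinate search: perturb one pose parameter at a time, keep improvements."""
    pose = list(init_pose)
    best = cosine_similarity(frame_embedding, embed_view(pose))
    while step > min_step:
        improved = False
        for i in range(len(pose)):
            for delta in (step, -step):
                candidate = pose.copy()
                candidate[i] += delta
                score = cosine_similarity(frame_embedding, embed_view(candidate))
                if score > best:
                    pose, best, improved = candidate, score, True
        if not improved:
            step *= 0.5  # no parameter helped at this scale; search more finely
    return pose, best


if __name__ == "__main__":
    true_pose = [0.3, -0.2, 0.1, 0.05, -0.4, 0.25]  # made-up 6-DoF parameters
    frame_embedding = toy_embed_view(true_pose)
    estimate, score = refine_pose(frame_embedding, [0.0] * 6, toy_embed_view)
    print(estimate, score)
```

In a real system the placeholder would be replaced by a renderer and a learned encoder, and a gradient-based optimizer may be preferable when both are differentiable.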

Taxonomy

Topics: Soft Robotics and Applications · Surgical Simulation and Training · Augmented Reality Applications