
Harnessing Foundation Models for Robust and Generalizable 6-DOF Bronchoscopy Localization

Published: May 30, 2025 | arXiv ID: 2505.24249v1

By: Qingyao Tian, Huai Liao, Xinyan Huang and more

Potential Business Impact:

Helps doctors see inside lungs better.

Business Areas:

Image Recognition, Data and Analytics, Software

Vision-based 6-DOF bronchoscopy localization offers a promising solution for accurate and cost-effective interventional guidance. However, existing methods struggle with 1) limited generalization across patient cases due to scarce labeled data, and 2) poor robustness under visual degradation, as bronchoscopy procedures frequently involve artifacts such as occlusions and motion blur that impair visual information. To address these challenges, we propose PANSv2, a generalizable and robust bronchoscopy localization framework. Motivated by PANS, which leverages multiple visual cues for pose likelihood measurement, PANSv2 integrates depth estimation, landmark detection, and centerline constraints into a unified pose optimization framework that evaluates pose probability and solves for the optimal bronchoscope pose. To further enhance generalization, we leverage the endoscopic foundation model EndoOmni for depth estimation and the video foundation model EndoMamba for landmark detection, incorporating both spatial and temporal analyses. Pretrained on diverse endoscopic datasets, these models provide stable and transferable visual representations, enabling reliable performance across varied bronchoscopy scenarios. Additionally, to improve robustness to visual degradation, we introduce an automatic re-initialization module that detects tracking failures and re-establishes the pose from landmark detections once clear views are available. Experimental results on a bronchoscopy dataset encompassing 10 patient cases show that PANSv2 achieves the highest tracking success rate, with an 18.1% improvement in SR-5 (percentage of absolute trajectory error under 5 mm) compared to existing methods, indicating potential for real clinical use.
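
The SR-5 metric quoted in the abstract can be illustrated with a minimal sketch. It assumes that the error is measured per frame as the Euclidean distance between estimated and ground-truth scope positions, that both trajectories are already expressed in the same coordinate frame in millimetres, and that "under 5 mm" means strictly less than the threshold. The function name `success_rate` and its parameters are hypothetical and are not taken from the paper.

```python
import math


def success_rate(est_positions_mm, gt_positions_mm, threshold_mm=5.0):
    """Percentage of frames whose position error is below threshold_mm.

    Assumption (not from the paper): error is the per-frame Euclidean
    distance between estimated and ground-truth positions, both given
    in the same coordinate frame and in millimetres.
    """
    if len(est_positions_mm) != len(gt_positions_mm):
        raise ValueError("trajectories must have the same number of frames")
    if not est_positions_mm:
        raise ValueError("trajectories must not be empty")

    # Per-frame translation error in millimetres.
    errors = [math.dist(e, g) for e, g in zip(est_positions_mm, gt_positions_mm)]

    # Count frames strictly under the threshold and convert to a percentage.
    hits = sum(1 for err in errors if err < threshold_mm)
    return 100.0 * hits / len(errors)


# Toy trajectory: per-frame errors are 1 mm, 3 mm, 6 mm and 5 mm.
gt = [(0.0, 0.0, 0.0)] * 4
est = [(1.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 6.0), (3.0, 4.0, 0.0)]
sr5 = success_rate(est, gt)  # two of four frames are under 5 mm
```

In this toy trajectory the frame with an error of exactly 5 mm does not count as a success, because the comparison is strict. Whether the paper treats the boundary this way is not stated in the abstract.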

BronchOpt: Vision-Based Pose Optimization with Fine-Tuned Foundation Models for Accurate Bronchoscopy Navigation

CV and Pattern Recognition

Helps doctors see inside lungs better during surgery.

12 Nov 2025 1

90%

BREATH-VL: Vision-Language-Guided 6-DoF Bronchoscopy Localization via Semantic-Geometric Fusion

CV and Pattern Recognition

Helps doctors find their way inside bodies.

7 Jan 2026 1

87%

Online Topological Localization for Navigation Assistance in Bronchoscopy

CV and Pattern Recognition

Helps doctors find lungs' hidden paths without scans.

10 Oct 2025 0


Page Count

9 pages
