SurgΣ: A Spectrum of Large-Scale Multimodal Data and Foundation Models for Surgical Intelligence
Zhitao Zeng, Mengya Xu, Jian Jiang, Pengfei Guo, Yunqiu Xu, Zhu Zhuo, Chang Han Low, Yufan He, Dong Yang, Chenxi Lin, Yiming Gu, Jiaxin Guo, Yutong Ban, Daguang Xu, Qi Dou, Yueming Jin

TL;DR
SurgΣ introduces a large-scale, multimodal surgical data repository and foundation models that enhance generalization and interpretability across diverse surgical tasks and specialties.
Contribution
The paper presents SurgΣ-DB, a comprehensive multimodal surgical dataset, and demonstrates foundation models that leverage this data for improved surgical intelligence.
Findings
Enhanced cross-task generalization in surgical models
Rich hierarchical reasoning annotations improve contextual understanding
Large-scale multimodal data supports diverse surgical applications
Abstract
Surgical intelligence has the potential to improve the safety and consistency of surgical care, yet most existing surgical AI frameworks remain task-specific and struggle to generalize across procedures and institutions. Although multimodal foundation models, particularly multimodal large language models, have demonstrated strong cross-task capabilities across various medical domains, their advancement in surgery remains constrained by the lack of large-scale, systematically curated multimodal data. To address this challenge, we introduce SurgΣ, a spectrum of large-scale multimodal data and foundation models for surgical intelligence. At the core of this framework lies SurgΣ-DB, a large-scale multimodal data foundation designed to support diverse surgical tasks. SurgΣ-DB consolidates heterogeneous surgical data sources (including open-source datasets, curated…
Taxonomy
Topics: Multimodal Machine Learning Applications · Surgical Simulation and Training · Artificial Intelligence in Healthcare and Education
