SpinalSAM-R1: A Vision-Language Multimodal Interactive System for Spine CT Segmentation
Jiaming Liu, Dingwei Fan, Junyong Zhao, Chunlin Li, Haipeng Si, Liang Sun

TL;DR
SpinalSAM-R1 is a multimodal interactive system that combines vision and language models to improve spine CT segmentation, enabling natural language-guided refinement and achieving high accuracy and efficiency in clinical settings.
Contribution
The paper introduces SpinalSAM-R1, a novel multimodal system integrating a fine-tuned SAM with DeepSeek-R1 for improved spine CT segmentation with interactive, language-guided refinement capabilities.
Findings
Achieves superior segmentation performance on spine CT images.
Supports 11 clinical operations with 94.3% parsing accuracy.
Provides sub-800 ms response times for interactive prompts.
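As an illustration of what a parsing-accuracy figure like the one above measures, here is a minimal sketch that maps free-text instructions to structured operations and scores them against reference labels. Everything in it is an assumption made for illustration: the operation names, the keyword table, and the helper functions `parse_instruction` and `parsing_accuracy` are hypothetical. The source does not list the 11 operations, and the real system uses DeepSeek-R1 rather than keyword matching.

```python
# Hypothetical operation names and trigger phrases; the source does not
# enumerate the 11 clinical operations, so these are placeholders.
KEYWORDS = {
    "add_point": ("add point", "include"),
    "remove_region": ("remove", "exclude"),
    "undo": ("undo",),
}


def parse_instruction(text: str) -> str:
    """Toy keyword parser standing in for the language model."""
    lowered = text.lower()
    for operation, phrases in KEYWORDS.items():
        if any(phrase in lowered for phrase in phrases):
            return operation
    return "unknown"


def parsing_accuracy(samples) -> float:
    """Fraction of (instruction, reference_operation) pairs parsed correctly."""
    correct = sum(parse_instruction(text) == label for text, label in samples)
    return correct / len(samples)


if __name__ == "__main__":
    # Made-up instructions, not data from the paper.
    samples = [
        ("Please include the L3 vertebra", "add_point"),
        ("Remove the rib on the left", "remove_region"),
        ("undo that", "undo"),
        ("make it brighter", "zoom"),
    ]
    print(parsing_accuracy(samples))
```

In this toy setup the last instruction has no matching phrase, so it is parsed as `unknown` and counts as an error. How the paper itself defines and measures parsing accuracy is not stated in this summary.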
Abstract
Segmenting the spine and adjacent anatomical structures from computed tomography (CT) images is a key step in the diagnosis and treatment of spinal disease. However, CT segmentation is impeded by low contrast and complex vertebral boundaries. Although advanced models such as the Segment Anything Model (SAM) have shown promise in various segmentation tasks, their performance in spinal CT imaging is limited by high annotation requirements and poor domain adaptability. To address these limitations, we propose SpinalSAM-R1, a multimodal vision-language interactive system that integrates a fine-tuned SAM with DeepSeek-R1 for spine CT image segmentation. Specifically, SpinalSAM-R1 introduces an anatomy-guided attention mechanism to improve spine segmentation performance, and a semantics-driven interaction protocol powered by DeepSeek-R1, enabling natural…
Taxonomy
Topics: Medical Imaging and Analysis · Spinal Fractures and Fixation Techniques · Advanced Neural Network Applications
