SA$^{2}$Net: Scale-Adaptive Structure-Affinity Transformation for Spine Segmentation from Ultrasound Volume Projection Imaging

Hao Xie; Zixun Huang; Yushen Zuo; Yakun Ju; Frank H. F. Leung; N. F. Law; Kin-Man Lam; Yong-Ping Zheng; Sai Ho Ling

arXiv:2510.26568 · cs.CV · October 31, 2025

TL;DR

SA$^{2}$Net is a novel deep learning model that improves spine segmentation from ultrasound images by capturing cross-dimensional features and encoding structural knowledge, aiding scoliosis diagnosis.

Contribution

The paper introduces a scale-adaptive, structure-aware network with a novel affinity transformation and feature mixing loss for enhanced spine segmentation accuracy.

Findings

01 Outperforms state-of-the-art segmentation methods

02 Achieves higher robustness and accuracy

03 Demonstrates adaptability to various backbone architectures

Abstract

Spine segmentation based on ultrasound volume projection imaging (VPI) plays a vital role in intelligent scoliosis diagnosis in clinical applications. However, this task faces two significant challenges. First, the global contextual knowledge of spines may not be learned well if the high spatial correlation between different bone features is neglected. Second, the spine bones contain rich structural knowledge regarding their shapes and positions, which deserves to be encoded into the segmentation process. To address these challenges, we propose a novel scale-adaptive structure-aware network (SA$^{2}$Net) for effective spine segmentation. First, we propose a scale-adaptive complementary strategy to learn the cross-dimensional long-distance correlation features for spinal images. Second, motivated by the consistency between multi-head self-attention in Transformers and semantic level…
