FlexICL: A Flexible Visual In-context Learning Framework for Elbow and Wrist Ultrasound Segmentation
Yuyue Zhou, Jessica Knight, Shrimanti Ghosh, Banafshe Felfeliyan, Jacob L. Jaremko, Abhilash R. Hareendranathan

TL;DR
FlexICL introduces a flexible in-context learning framework for ultrasound segmentation of elbow and wrist bones, achieving high accuracy with minimal labeled data and outperforming existing models.
Contribution
The paper presents a visual in-context learning approach with new image concatenation techniques, enabling effective segmentation when only 5% of the ultrasound images are labeled.
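To make the idea of image concatenation concrete, here is a minimal sketch of one layout sometimes used in visual in-context learning: a 2x2 canvas holding a labeled support frame, its mask, the query frame, and an empty slot for the model to fill in. This summary does not describe the layouts FlexICL actually uses, so the grid arrangement, the function name `make_icl_canvas`, and the single-channel inputs are all assumptions made for illustration.

```python
import numpy as np


def make_icl_canvas(support_img, support_mask, query_img):
    """Build a hypothetical 2x2 in-context canvas (assumed layout, not the paper's).

    Top row:    support image | support mask
    Bottom row: query image   | blank slot for the predicted mask
    All inputs are 2-D arrays of identical shape.
    """
    support_img = np.asarray(support_img)
    support_mask = np.asarray(support_mask)
    query_img = np.asarray(query_img)
    if not (support_img.shape == support_mask.shape == query_img.shape):
        raise ValueError("support image, support mask, and query must share a shape")
    if query_img.ndim != 2:
        raise ValueError("this sketch expects single-channel (2-D) images")

    # Use a common dtype so the concatenation does not silently truncate values.
    dtype = np.result_type(support_img, support_mask, query_img)
    blank = np.zeros(query_img.shape, dtype=dtype)

    top = np.concatenate([support_img.astype(dtype), support_mask.astype(dtype)], axis=1)
    bottom = np.concatenate([query_img.astype(dtype), blank], axis=1)
    return np.concatenate([top, bottom], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    support = rng.random((4, 4))
    mask = (support > 0.5).astype(float)
    query = rng.random((4, 4))
    canvas = make_icl_canvas(support, mask, query)
    print(canvas.shape)
```

In the intra-video setting described in the abstract, the support frame and mask would come from the small annotated subset of a video and the query from one of its unannotated frames.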
Findings
Achieves Dice scores 1-27% higher than the compared state-of-the-art models.
Performs robustly when only 5% of the images are labeled.
Outperforms U-Net, TransUNet, Painter, and MAE-VQGAN on multiple datasets.
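For readers unfamiliar with the metric behind these findings, the Dice score measures overlap between a predicted mask and a reference mask as 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical masks). The sketch below is a generic NumPy implementation for binary masks, not the paper's evaluation code. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / total)


if __name__ == "__main__":
    pred = np.array([[1, 1], [0, 0]])
    target = np.array([[1, 0], [0, 0]])
    # One overlapping pixel, 2 predicted + 1 reference: 2 * 1 / 3
    print(dice_score(pred, target))
```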
Abstract
Elbow and wrist fractures are the most common fractures in pediatric populations. Automatic segmentation of musculoskeletal structures in ultrasound (US) can improve diagnostic accuracy and treatment planning. Fractures appear as cortical defects but require expert interpretation. Deep learning (DL) can provide real-time feedback and highlight key structures, helping lightly trained users perform exams more confidently. However, pixel-wise expert annotations for training remain time-consuming and costly. To address this challenge, we propose FlexICL, a novel and flexible in-context learning (ICL) framework for segmenting bony regions in US images. We apply it to an intra-video segmentation setting, where experts annotate only a small subset of frames, and the model segments unseen frames. We systematically investigate various image concatenation techniques and training strategies for…
