FlexICL: A Flexible Visual In-context Learning Framework for Elbow and Wrist Ultrasound Segmentation

Yuyue Zhou; Jessica Knight; Shrimanti Ghosh; Banafshe Felfeliyan; Jacob L. Jaremko; Abhilash R. Hareendranathan

arXiv:2510.26049·cs.CV·October 31, 2025

TL;DR

FlexICL introduces a flexible in-context learning framework for ultrasound segmentation of elbow and wrist bones, achieving high accuracy with minimal labeled data and outperforming existing models.

Contribution

The paper presents a novel visual in-context learning approach with new concatenation techniques, enabling effective segmentation using only 5% of labeled ultrasound images.

Findings

1. Achieves 1-27% higher Dice scores than state-of-the-art models.

2. Requires only 5% of labeled data for robust performance.

3. Outperforms U-Net, TransUNet, Painter, and MAE-VQGAN on multiple datasets.
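
The findings are reported in terms of the Dice score, the standard overlap measure for segmentation masks: twice the intersection of the predicted and reference masks divided by the sum of their sizes. The sketch below is a generic NumPy implementation for binary masks, not the authors' evaluation code. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape.

    Hypothetical helper for illustration; the paper's exact evaluation
    protocol (thresholding, per-frame vs. per-video averaging) is not
    specified in this summary.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


# Toy example: the prediction covers two pixels, the reference covers one.
pred_mask = np.array([[1, 1], [0, 0]])
true_mask = np.array([[1, 0], [0, 0]])
score = dice_score(pred_mask, true_mask)
```

Here the intersection is one pixel and the mask sizes sum to three, so the score is 2/3.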

Abstract

Elbow and wrist fractures are the most common fractures in pediatric populations. Automatic segmentation of musculoskeletal structures in ultrasound (US) can improve diagnostic accuracy and treatment planning. Fractures appear as cortical defects but require expert interpretation. Deep learning (DL) can provide real-time feedback and highlight key structures, helping lightly trained users perform exams more confidently. However, pixel-wise expert annotations for training remain time-consuming and costly. To address this challenge, we propose FlexICL, a novel and flexible in-context learning (ICL) framework for segmenting bony regions in US images. We apply it to an intra-video segmentation setting, where experts annotate only a small subset of frames, and the model segments unseen frames. We systematically investigate various image concatenation techniques and training strategies for…
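
The intra-video setting described above can be pictured with a small sketch: pick a small subset of frames to annotate, then concatenate an annotated frame, its mask, and an unlabeled query frame into one canvas for the model to complete. This summary does not describe the concatenation techniques FlexICL actually uses, so the 2x2 grid below is a generic layout of the kind used in visual in-context learning, assumed here for illustration. The function names, the evenly spaced frame selection, and the 5% default are likewise assumptions.

```python
import numpy as np


def select_support_indices(num_frames, fraction=0.05):
    """Pick evenly spaced frame indices to annotate (assumed strategy)."""
    k = max(1, int(round(num_frames * fraction)))
    idx = np.linspace(0, num_frames - 1, k).round().astype(int)
    return np.unique(idx)


def build_prompt_canvas(support_img, support_mask, query_img):
    """Concatenate a labeled example and a query frame into a 2x2 grid.

    Layout (assumed, not taken from the paper):
        [ support image | support mask ]
        [ query image   | blank region ]
    The blank quadrant is where a model would be asked to predict the
    query mask.
    """
    h, w = query_img.shape
    if support_img.shape != (h, w) or support_mask.shape != (h, w):
        raise ValueError("all inputs must share the same 2D shape")

    canvas = np.zeros((2 * h, 2 * w), dtype=np.float32)
    canvas[:h, :w] = support_img
    canvas[:h, w:] = support_mask
    canvas[h:, :w] = query_img
    return canvas


# Toy video: 100 grayscale frames of 8x8 pixels with random content.
rng = np.random.default_rng(0)
video = rng.random((100, 8, 8)).astype(np.float32)
support_idx = select_support_indices(len(video))

# Placeholder mask standing in for an expert annotation of the first support frame.
mask = (video[support_idx[0]] > 0.5).astype(np.float32)
canvas = build_prompt_canvas(video[support_idx[0]], mask, video[1])
```

With 100 frames and a 5% budget, this selects five support frames spanning the video; every other frame would be treated as a query.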
