HRSAM: Efficient Interactive Segmentation in High-Resolution Images
You Huang, Wenbin Lai, Jiayi Ji, Liujuan Cao, Shengchuan Zhang, Rongrong Ji

TL;DR
HRSAM is a lightweight, high-resolution interactive segmentation model that generalizes well across resolutions, offering faster processing and improved accuracy over existing models through innovative attention and extrapolation techniques.
Contribution
We introduce HRSAM, a novel high-resolution segmentation model with a flexible attention framework and extrapolation capabilities, outperforming prior methods in speed and accuracy.
Findings
HRSAM surpasses the previous state of the art while using only 38% of its latency.
Extrapolation enables high-resolution generalization from low-resolution training.
Flash Swin attention accelerates processing by over 35%.
Abstract
The Segment Anything Model (SAM) has advanced interactive segmentation but is limited by the high computational cost on high-resolution images. This requires downsampling to meet GPU constraints, sacrificing the fine-grained details needed for high-precision interactive segmentation. To address SAM's limitations, we focus on visual length extrapolation and propose a lightweight model named HRSAM. The extrapolation enables HRSAM trained on low resolutions to generalize to high resolutions. We begin by finding the link between the extrapolation and attention scores, which leads us to base HRSAM on Swin attention. We then introduce the Flexible Local Attention (FLA) framework, using CUDA-optimized Efficient Memory Attention to accelerate HRSAM. Within FLA, we implement Flash Swin attention, achieving over a 35% speedup compared to traditional Swin attention, and propose a KV-only padding…
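To make the cost argument in the abstract concrete, the sketch below counts query-key pairs for global attention versus window-based (Swin-style) attention at two resolutions. The patch size of 16, the window size of 16 tokens, and all function names are illustrative assumptions, not HRSAM's actual configuration. The sketch does not model shifted windows, Flash Swin attention, or KV-only padding. It only shows that, under windowing, the number of keys each query sees stays fixed as resolution grows, which is one plausible intuition for why a window-based model trained at low resolution might transfer to high resolution.

```python
def num_tokens(height, width, patch=16):
    """Number of patch tokens for an image whose sides are multiples of `patch`."""
    return (height // patch) * (width // patch)


def global_attention_pairs(n_tokens):
    # Every query attends to every key: quadratic in the token count.
    return n_tokens * n_tokens


def window_attention_pairs(n_tokens, window=16):
    # Each query attends only to the window*window keys in its own window
    # (assumes the token grid divides evenly into windows).
    return n_tokens * window * window


def compare(height, width, patch=16, window=16):
    """Summarize attention cost for one image size (hypothetical helper)."""
    n = num_tokens(height, width, patch)
    return {
        "tokens": n,
        "keys_per_query_global": n,
        "keys_per_query_window": window * window,
        "global_pairs": global_attention_pairs(n),
        "window_pairs": window_attention_pairs(n, window),
    }


if __name__ == "__main__":
    for side in (1024, 2048):
        print(side, compare(side, side))
```

Doubling the image side quadruples the token count, so the global pair count grows 16x while the windowed pair count grows only 4x, and the keys seen by each query stay at 256 in both cases.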
Taxonomy
Topics: Advanced Neural Network Applications · Medical Imaging and Analysis · Medical Image Segmentation Techniques
Methods: Softmax · Attention Is All You Need · Focus · Balanced Selection
