XAttnRes: Cross-Stage Attention Residuals for Medical Image Segmentation
Xinyu Liu, Qing Xu, Zhen Chen

TL;DR
XAttnRes introduces an attention residual mechanism for medical image segmentation in which each stage selectively aggregates features from all preceding encoder and decoder stages, improving performance across four datasets and three imaging modalities.
Contribution
The paper presents Cross-Stage Attention Residuals (XAttnRes), a lightweight, learnable aggregation method that bridges encoder-decoder stages and improves segmentation accuracy.
Findings
XAttnRes improves performance across four datasets and three imaging modalities.
With skip connections removed, XAttnRes alone can match baseline performance.
XAttnRes effectively handles cross-resolution features with negligible overhead.
Abstract
In the field of Large Language Models (LLMs), Attention Residuals have recently demonstrated that learned, selective aggregation over all preceding layer outputs can outperform fixed residual connections. We propose Cross-Stage Attention Residuals (XAttnRes), a mechanism that maintains a global feature history pool accumulating both encoder and decoder stage outputs. Through lightweight pseudo-query attention, each stage selectively aggregates from all preceding representations. To bridge the gap between the same-dimensional Transformer layers in LLMs and the multi-scale encoder-decoder stages in segmentation networks, XAttnRes introduces spatial alignment and channel projection steps that handle cross-resolution features with negligible overhead. When added to existing segmentation networks, XAttnRes consistently improves performance across four datasets and three imaging modalities.…
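The aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes nearest-neighbour resizing for spatial alignment, a 1x1 linear map for channel projection, and a per-pixel softmax over pool entries driven by a single learned pseudo-query vector. All names here (`resize_nearest`, `rms_norm`, `cross_stage_aggregate`, `projections`, `query`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def resize_nearest(x, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) array to (C, out_h, out_w)."""
    _, h, w = x.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return x[:, rows[:, None], cols[None, :]]

def rms_norm(x, eps=1e-6):
    """Scale each pixel's channel vector to unit RMS (axis 0 = channels)."""
    return x / np.sqrt(np.mean(x ** 2, axis=0, keepdims=True) + eps)

def cross_stage_aggregate(pool, projections, query, out_hw):
    """Aggregate a pool of multi-scale features for one target stage.

    pool:        list of (C_i, H_i, W_i) arrays (assumed history of stage outputs)
    projections: list of (C, C_i) matrices (assumed 1x1 channel projections)
    query:       (C,) pseudo-query vector for the target stage (assumed learnable)
    out_hw:      (H, W) resolution of the target stage
    """
    out_h, out_w = out_hw
    aligned = []
    for feat, proj in zip(pool, projections):
        feat = resize_nearest(feat, out_h, out_w)        # spatial alignment
        feat = np.einsum("oc,chw->ohw", proj, feat)      # channel projection
        aligned.append(feat)
    values = np.stack(aligned)                           # (N, C, H, W)
    keys = np.stack([rms_norm(v) for v in aligned])      # normalised keys
    logits = np.einsum("c,nchw->nhw", query, keys)       # one score per source, per pixel
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)  # softmax over sources
    out = np.einsum("nhw,nchw->chw", weights, values)    # weighted sum of sources
    return out, weights

# Toy pool: three stage outputs at different resolutions and widths.
rng = np.random.default_rng(0)
pool = [
    rng.standard_normal((8, 32, 32)),
    rng.standard_normal((16, 16, 16)),
    rng.standard_normal((32, 8, 8)),
]
target_c, target_hw = 16, (16, 16)
projections = [0.1 * rng.standard_normal((target_c, f.shape[0])) for f in pool]
query = np.zeros(target_c)  # a zero query gives uniform weights over the pool
out, weights = cross_stage_aggregate(pool, projections, query, target_hw)
```

In this sketch a zero query reduces the aggregation to a plain average of the aligned features, and a trained query would shift weight toward the more useful stages. Whether the paper normalises keys, resizes, or initialises queries this way is not stated in the abstract above.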
