Weakly Supervised Segmentation of Hyper-Reflective Foci with Compact Convolutional Transformers and SAM2
Olivier Morelle (1, 2), Justus Bisten (1), Maximilian W. M. Wintergerst (2, 5), Robert P. Finger (2, 4), Thomas Schultz (1, 3), ((1) B-IT, Department of Computer Science, University of Bonn, (2) Department of Ophthalmology, University Hospital Bonn, (3) Lamarr Institute

TL;DR
This paper introduces a novel weakly supervised segmentation framework for small structures in OCT images, combining attention-based MIL, Layer-wise Relevance Propagation, SAM2, and Compact Convolutional Transformers to improve resolution and accuracy.
Contribution
It proposes a new approach that enhances weakly supervised segmentation of small structures by integrating LRP, SAM2, and CCT, overcoming limitations of coarse localization and downsampling.
Findings
Improved segmentation accuracy for hyper-reflective foci in OCT images.
Enhanced spatial resolution and recall through iterative inference.
Effective use of CCT and SAM2 in weakly supervised segmentation.
Abstract
Weakly supervised segmentation has the potential to greatly reduce the annotation effort for training segmentation models for small structures such as hyper-reflective foci (HRF) in optical coherence tomography (OCT). However, most weakly supervised methods either involve a strong downsampling of input images, or only achieve localization at a coarse resolution, both of which are unsatisfactory for small structures. We propose a novel framework that increases the spatial resolution of a traditional attention-based Multiple Instance Learning (MIL) approach by using Layer-wise Relevance Propagation (LRP) to prompt the Segment Anything Model (SAM 2), and increases recall with iterative inference. Moreover, we demonstrate that replacing MIL with a Compact Convolutional Transformer (CCT), which adds a positional encoding, and permits an exchange of information between different regions of…
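The abstract names attention-based MIL as its baseline but does not spell out the formulation. As a rough illustration only, the sketch below assumes the commonly used attention pooling form, in which each instance embedding h_k receives a score w^T tanh(V h_k), the scores are normalized with a softmax over the bag, and the bag embedding is the weighted sum of the instances. The function name `attention_mil_pool` and the parameters `V` and `w` are hypothetical and not taken from the paper; in a real model they would be learned, and the instances would be feature vectors of image patches.

```python
import math


def attention_mil_pool(instances, V, w):
    """Pool instance embeddings into one bag embedding using attention weights.

    instances: list of K embeddings, each a list of D floats
    V:         L x D projection matrix as a list of rows (hypothetical parameter)
    w:         list of L floats (hypothetical parameter)
    Returns (bag_embedding, attention_weights).
    """
    # Unnormalized attention score per instance: w^T tanh(V h_k)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * t for wi, t in zip(w, hidden)))

    # Numerically stable softmax over the instances of the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Bag embedding: attention-weighted sum of the instance embeddings
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(attn, instances)) for d in range(dim)]
    return bag, attn


if __name__ == "__main__":
    # Toy bag of three 2-D "patch" embeddings with identity projection
    patches = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    bag, attn = attention_mil_pool(patches, V=[[1.0, 0.0], [0.0, 1.0]], w=[1.0, 1.0])
    print("attention weights:", attn)
    print("bag embedding:", bag)
```

The attention weights give one value per instance, which is why such a model localizes only at the resolution of its instances; this is the coarse localization that the abstract says the LRP and SAM 2 steps are meant to refine.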
Taxonomy
Topics: Optical Systems and Laser Technology · Advanced SAR Imaging Techniques
Methods: Absolute Position Encodings · Adam · Residual Connection · Dropout · Softmax · Byte Pair Encoding · Linear Layer · Attention Is All You Need · Multi-Head Attention · Position-Wise Feed-Forward Layer
