Differential Attention-Augmented BiomedCLIP with Asymmetric Focal Optimization for Imbalanced Multi-Label Video Capsule Endoscopy Classification
Podakanti Satyajith Chary, Nagarajan Ganapathy

TL;DR
This paper introduces a novel multi-label video capsule endoscopy classification framework that enhances BiomedCLIP with differential attention and asymmetric focal optimization to effectively handle extreme class imbalance and improve detection accuracy.
Contribution
It proposes a differential attention mechanism and a comprehensive optimization strategy to address class imbalance in biomedical video classification tasks.
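As a rough illustration of the differential attention idea, the sketch below subtracts one softmax attention map from another before weighting the values. It is a minimal single-head version in plain Python, not the authors' implementation: the scalar weight `lam`, the function names, and the use of separate query/key pairs (`q1`, `k1`, `q2`, `k2`) are assumptions made for this example.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_map(q, k):
    """Row-wise softmax of the scaled dot products q @ k^T / sqrt(d)."""
    d = len(q[0])
    scale = math.sqrt(d)
    return [
        softmax([sum(a * b for a, b in zip(qi, kj)) / scale for kj in k])
        for qi in q
    ]

def differential_attention(q1, k1, q2, k2, v, lam=0.5):
    """Weight values by (softmax map 1) - lam * (softmax map 2).

    lam is a hypothetical scalar; how the paper parameterizes it is not
    stated in this summary.
    """
    a1 = attention_map(q1, k1)
    a2 = attention_map(q2, k2)
    # Element-wise difference of the two attention maps.
    diff = [[x - lam * y for x, y in zip(r1, r2)] for r1, r2 in zip(a1, a2)]
    # Apply the differential map to the value vectors.
    n_cols = len(v[0])
    out = [
        [sum(w * vj[c] for w, vj in zip(row, v)) for c in range(n_cols)]
        for row in diff
    ]
    return out, diff
```

Attention mass that both maps assign to the same tokens cancels in the subtraction, which is the intuition behind the noise suppression. Each row of the differential map sums to `1 - lam` rather than 1.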
Findings
Achieved a temporal mAP@0.5 of 0.2456 on the RARE-VISION dataset.
Implemented a pipeline with 8.6-minute inference time on a single GPU.
Effectively suppressed attention noise and handled severe class imbalance.
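To make the class-imbalance handling more concrete, here is a minimal sketch of an asymmetric focal loss for multi-label predictions. It assumes the common formulation with separate focusing exponents for positive and negative labels. The default values of `gamma_pos` and `gamma_neg` are illustrative assumptions, not the paper's settings.

```python
import math

def asymmetric_focal_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0, eps=1e-8):
    """Mean asymmetric focal loss over per-label probabilities.

    probs   : predicted probabilities in [0, 1], one per label
    targets : 0/1 ground-truth labels, same length as probs
    gamma_pos, gamma_neg : hypothetical focusing exponents; a larger
        gamma_neg down-weights easy negatives, which dominate when
        positives are rare.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive term: (1 - p)^gamma_pos * -log(p)
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative term: p^gamma_neg * -log(1 - p)
            total += -(p ** gamma_neg) * math.log(max(1.0 - p, eps))
    return total / len(probs)
```

With `gamma_neg` larger than `gamma_pos`, a confidently rejected negative contributes almost nothing to the loss, so the gradient signal is not swamped by the abundant normal frames.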
Abstract
This work presents a multi-label classification framework for video capsule endoscopy (VCE) that addresses the extreme class imbalance inherent in the Galar dataset through a combination of architectural and optimization-level strategies. Our approach modifies BiomedCLIP, a biomedical vision-language foundation model, by replacing its standard multi-head self-attention with a differential attention mechanism that computes the difference between two softmax attention maps to suppress attention noise. To counteract the skewed label distribution, where pathological findings constitute less than 0.1% of all annotated frames, a sqrt-frequency weighted sampler, asymmetric focal loss, mixup regularization, and per-class threshold optimization are employed. Temporal coherence is enforced through median-filter smoothing and gap merging prior to event-level JSON generation. On the held-out…
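The temporal post-processing described in the abstract (median-filter smoothing, gap merging, then event-level JSON) could look roughly like the sketch below for a single class. The window size, the `max_gap` value, and the `start`/`end` field names are assumptions for illustration; the abstract does not specify them.

```python
import json
from statistics import median

def median_smooth(preds, window=5):
    """Sliding-window median over binary per-frame predictions.

    Windows are truncated at the sequence edges; a tie (median of 0.5)
    is resolved to 0.
    """
    half = window // 2
    smoothed = []
    for i in range(len(preds)):
        segment = preds[max(0, i - half): i + half + 1]
        smoothed.append(1 if median(segment) > 0.5 else 0)
    return smoothed

def to_events(preds, max_gap=2):
    """Turn per-frame 0/1 predictions into merged [start, end] events.

    Consecutive events separated by at most max_gap negative frames are
    merged into one. Frame indices are inclusive.
    """
    runs = []
    start = None
    for i, p in enumerate(preds):
        if p and start is None:
            start = i
        elif not p and start is not None:
            runs.append([start, i - 1])
            start = None
    if start is not None:
        runs.append([start, len(preds) - 1])

    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] - 1 <= max_gap:
            merged[-1][1] = run[1]  # close the short gap
        else:
            merged.append(run)
    return [{"start": s, "end": e} for s, e in merged]

def events_to_json(preds, window=5, max_gap=2):
    """Smooth, merge, and serialize events for one class."""
    return json.dumps(to_events(median_smooth(preds, window), max_gap))
```

Smoothing removes isolated single-frame flips, and gap merging then joins detections that are interrupted by a few missed frames, so each finding is reported as one contiguous event.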
Taxonomy
Topics: Gastrointestinal Bleeding Diagnosis and Treatment · Retinal Imaging and Analysis · Colorectal Cancer Screening and Detection
