ResNet-50 with Class Reweighting and Anatomy-Guided Temporal Decoding for Gastrointestinal Video Analysis
Romil Imtiaz, Dimitris K. Iakovidis

TL;DR
This paper presents a gastrointestinal video analysis pipeline that pairs a ResNet-50 frame classifier with class reweighting and anatomy-guided temporal decoding, improving rare-class label prediction and temporal event accuracy.
Contribution
The paper combines clipped class-wise positive weighting in the training loss with anatomy-guided temporal event decoding to improve gastrointestinal video analysis performance.
Findings
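Clipped class-wise positive weighting in the training loss improved rare-class learning while keeping optimization stable.
Temporal mAP improved from 0.3801 to 0.4303.
The final submission combined GT-style framewise event composition, anatomy vote smoothing, and anatomy-based pathology gating with a conservative hysteresis decoder.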
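The class-wise positive weighting mentioned in the first finding can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the summary does not give the weighting formula or the clip value, so the negative-to-positive ratio, the clip range [1, max_weight], and the names clipped_pos_weights and weighted_bce are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def clipped_pos_weights(labels: Sequence[Sequence[int]],
                        max_weight: float = 10.0) -> List[float]:
    """Per-class weight = (#negatives / #positives), clipped to [1, max_weight].

    The ratio and the clip range are assumptions; the source only says the
    weights are class-wise and clipped.
    """
    num_samples = len(labels)
    num_classes = len(labels[0])
    weights = []
    for c in range(num_classes):
        positives = sum(row[c] for row in labels)
        negatives = num_samples - positives
        # A class with no positives gets the maximum weight (assumption).
        raw = max_weight if positives == 0 else negatives / positives
        weights.append(min(max(raw, 1.0), max_weight))
    return weights


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce(logits: Sequence[float], targets: Sequence[int],
                 pos_weights: Sequence[float]) -> float:
    """Mean multi-label binary cross-entropy with a weight on positive terms."""
    total = 0.0
    for x, y, w in zip(logits, targets, pos_weights):
        # -log(sigmoid(x)) == softplus(-x); -log(1 - sigmoid(x)) == softplus(x)
        total += w * y * softplus(-x) + (1 - y) * softplus(x)
    return total / len(logits)


# Toy label matrix: 4 frames, 2 classes (class 1 is the rarer one).
toy_labels = [[1, 0], [1, 0], [1, 1], [0, 0]]
weights = clipped_pos_weights(toy_labels)
```

In a PyTorch training loop, weights of this kind would typically be passed as the pos_weight tensor of torch.nn.BCEWithLogitsLoss. Clipping caps the influence of extremely rare labels, which is one plausible reading of how the weighting keeps optimization stable.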
Abstract
We developed a multi-label gastrointestinal video analysis pipeline based on a ResNet-50 frame classifier followed by anatomy-guided temporal event decoding. The system predicts 17 labels, including 5 anatomy classes and 12 pathology classes, from frames resized to 336x336. A major challenge was severe class imbalance, particularly for rare pathology labels. To address this, we used clipped class-wise positive weighting in the training loss, which improved rare-class learning while maintaining stable optimization. At the temporal stage, we found that direct frame-to-event conversion produced fragmented mismatches with the official ground truth. The final submission therefore combined GT-style framewise event composition, anatomy vote smoothing, and anatomy-based pathology gating with a conservative hysteresis decoder. This design improved the final temporal mAP from 0.3801 to 0.4303 on…
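The temporal stage described in the abstract (anatomy vote smoothing, anatomy-based pathology gating, and a hysteresis decoder) might look something like the sketch below. It is illustrative only: the window size, the thresholds, the anatomy names "stomach" and "colon", the allowed-anatomy set, and all function names are hypothetical placeholders, since the summary does not specify them.

```python
from collections import Counter
from typing import List, Sequence, Set, Tuple


def smooth_anatomy(labels: Sequence[str], window: int = 5) -> List[str]:
    """Replace each frame's anatomy label with the majority vote in a window."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        votes = Counter(labels[max(0, i - half): i + half + 1])
        smoothed.append(votes.most_common(1)[0][0])
    return smoothed


def gate_scores(scores: Sequence[float], anatomy: Sequence[str],
                allowed: Set[str]) -> List[float]:
    """Zero a pathology score on frames whose anatomy is not in `allowed`."""
    return [s if a in allowed else 0.0 for s, a in zip(scores, anatomy)]


def hysteresis_events(scores: Sequence[float], high: float = 0.6,
                      low: float = 0.4,
                      min_len: int = 1) -> List[Tuple[int, int]]:
    """Decode inclusive (start, end) frame ranges with two thresholds.

    An event opens when a score reaches `high` and closes only when a later
    score drops below `low`, which avoids splitting on small dips.
    """
    events = []
    start = None
    for i, s in enumerate(scores):
        if start is None:
            if s >= high:
                start = i
        elif s < low:
            events.append((start, i - 1))
            start = None
    if start is not None:
        events.append((start, len(scores) - 1))
    return [(a, b) for a, b in events if b - a + 1 >= min_len]


# Toy example: one pathology score track and per-frame anatomy predictions.
scores = [0.1, 0.7, 0.5, 0.45, 0.3, 0.65, 0.2]
anatomy = smooth_anatomy(["colon", "colon", "stomach", "colon",
                          "colon", "colon", "colon"], window=3)
gated = gate_scores(scores, anatomy, allowed={"colon"})
events = hysteresis_events(gated)
```

On the toy track, a single threshold at 0.6 would end the first event after one frame, whereas the two-threshold decoder keeps frames 1 to 3 together. Raising min_len makes the decoder more conservative by dropping very short events.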
Taxonomy
Topics: Machine Learning in Healthcare · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
