Capsule Vision Challenge 2024: Multi-Class Abnormality Classification for Video Capsule Endoscopy
Aakarsh Bansal, Bhuvanesh Singla, Raajan Rajesh Wankhade, Nagamma Patil

TL;DR
This paper introduces a scalable, multi-stage classification model for detecting abnormalities in video capsule endoscopy frames, addressing data imbalance and learning complexity through augmentation and progressive training strategies.
Contribution
It proposes a novel tiered augmentation and training approach for multi-class abnormality classification in VCE, adaptable with different architectures.
Findings
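Tiered augmentation with the albumentations library raises minority-class representation to counter data imbalance.
Progressive training, starting from normal versus abnormal and adding more specific classes as data allows, improves classification accuracy.
The PyTorch pipeline is architecture-agnostic and was tested with ResNet50 and a custom ViT-CNN hybrid.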
Abstract
This study presents an approach to developing a model for classifying abnormalities in video capsule endoscopy (VCE) frames. Given the challenges of data imbalance, we implemented a tiered augmentation strategy using the albumentations library to enhance minority class representation. Additionally, we addressed learning complexities by progressively structuring training tasks, allowing the model to differentiate between normal and abnormal cases and then gradually adding more specific classes based on data availability. Our pipeline, developed in PyTorch, employs a flexible architecture enabling seamless adjustments to classification complexity. We tested our approach using ResNet50 and a custom ViT-CNN hybrid model, with training conducted on the Kaggle platform. This work demonstrates a scalable approach to abnormality classification in VCE.
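The two ideas in the abstract, tiered augmentation and progressively structured training tasks, can be illustrated with a small dependency-free sketch. It is not the authors' pipeline: the tier thresholds, the number of augmented copies per tier, the function names, and the merged-class label are all assumptions chosen for illustration, and the actual image transforms (which the paper performs with albumentations) are left out. The first function decides how many augmented copies each class would receive based on its rarity; the second builds one label mapping per training stage, beginning with normal versus abnormal and splitting out abnormal classes in order of data availability.

```python
from collections import Counter


def augmentation_tiers(labels, thresholds=(0.5, 0.1), copies=(0, 2, 5)):
    """Map each class to a number of augmented copies per image.

    Classes are tiered by their size relative to the largest class.
    The thresholds and copy counts are illustrative assumptions,
    not values taken from the paper.
    """
    counts = Counter(labels)
    largest = max(counts.values())
    plan = {}
    for cls, n in counts.items():
        ratio = n / largest
        if ratio >= thresholds[0]:
            plan[cls] = copies[0]   # well represented: no extra copies
        elif ratio >= thresholds[1]:
            plan[cls] = copies[1]   # moderately rare: light augmentation
        else:
            plan[cls] = copies[2]   # very rare: heavy augmentation
    return plan


def stage_label_maps(labels, normal="Normal", rest="Other abnormal"):
    """Build one label mapping per progressive training stage.

    Stage 0 is normal vs. abnormal. Each later stage separates one more
    abnormal class, most frequent first, and the final stage keeps every
    class distinct. The `rest` label is a hypothetical placeholder.
    """
    counts = Counter(labels)
    abnormal = sorted(
        (c for c in counts if c != normal),
        key=lambda c: (-counts[c], c),
    )
    maps = []
    # Stop before the stage where `rest` would hold a single class,
    # since that stage would duplicate the final one.
    for k in range(len(abnormal) - 1):
        kept = set(abnormal[:k])
        maps.append(
            {c: (c if c == normal or c in kept else rest) for c in counts}
        )
    maps.append({c: c for c in counts})  # final stage: all classes separate
    return maps
```

In a PyTorch setting, each stage's mapping would typically be applied to the dataset labels, with the classifier head resized to the number of distinct targets for that stage; how the authors carry weights between stages is not stated in the abstract.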
Taxonomy
Topics: Gastrointestinal Bleeding Diagnosis and Treatment
