Structured Model Pruning for Efficient Inference in Computational Pathology
Mohammed Adnan, Qinle Ba, Nazim Shaikh, Shivam Kalra, Satarupa Mukherjee, Auranuch Lorsakul

TL;DR
This paper presents a pruning methodology for U-Net models in computational pathology, achieving at least 70% model size reduction with minimal performance loss, thereby enabling more efficient AI inference in healthcare applications.
Contribution
It introduces a novel pruning approach tailored for biomedical U-Net architectures, demonstrating significant compression with negligible accuracy impact.
Findings
Models can be compressed by at least 70%
Pruning maintains high segmentation and classification accuracy
Multiple heuristics evaluated for optimal pruning strategy
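To give a feel for what a 70% size reduction implies, the sketch below counts convolution parameters before and after uniformly shrinking the hidden channel widths. It is only an illustration under stated assumptions: the layer widths, kernel sizes, and the helper names `count_params` and `prune_widths` are hypothetical, the network is a plain chain of convolutions rather than a U-Net with skip connections, and "model size" is taken to mean parameter count. Because each convolution's weight count scales with both its input and output channels, keeping a fraction f of the channels leaves roughly f squared of the parameters.

```python
from typing import List


def count_params(channels: List[int], kernels: List[int]) -> int:
    """Weights plus biases for a chain of 2D convolutions.

    channels[i] -> channels[i + 1] is the i-th layer; kernels[i] is its
    (square) kernel size.
    """
    return sum(
        c_out * c_in * k * k + c_out
        for c_in, c_out, k in zip(channels, channels[1:], kernels)
    )


def prune_widths(channels: List[int], keep_fraction: float) -> List[int]:
    """Scale hidden widths by keep_fraction.

    The first entry (image channels) and last entry (output classes) are
    left unchanged, since those are fixed by the task.
    """
    hidden = [max(1, round(c * keep_fraction)) for c in channels[1:-1]]
    return [channels[0], *hidden, channels[-1]]


# Hypothetical toy network: 3 input channels, 2 output classes.
channels = [3, 64, 128, 64, 2]
kernels = [3, 3, 3, 1]

original = count_params(channels, kernels)
pruned = count_params(prune_widths(channels, keep_fraction=0.5), kernels)
reduction = 1.0 - pruned / original
print(f"original={original} pruned={pruned} reduction={reduction:.1%}")
```

In this toy setting, keeping half of the hidden channels already removes roughly three quarters of the parameters, which is why structured pruning can reach large compression ratios without removing most channels.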
Abstract
Recent years have seen significant efforts to adopt Artificial Intelligence (AI) in healthcare for various use cases, from computer-aided diagnosis to ICU triage. However, the size of AI models has been growing rapidly due to scaling laws and the success of foundational models, which makes it increasingly challenging to use advanced models in practical applications. It is thus imperative to develop efficient models, especially for deploying AI solutions under resource constraints or with time sensitivity. One potential solution is model compression, a set of techniques that remove less important model components or reduce parameter precision in order to lower a model's computational demand. In this work, we demonstrate that model pruning, as a model compression technique, can effectively reduce inference cost for computational and digital pathology-based analysis with a negligible loss of…
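The abstract describes pruning as removing less important model components. A minimal sketch of one common structured-pruning heuristic follows: rank the filters of a convolutional layer by the L1 norm of their weights and drop the lowest-ranked ones, then drop the matching input channels of the next layer. The L1-norm criterion, the nested-list weight layout, and the function names are assumptions made for illustration; the summary above does not state which heuristics the authors evaluated or selected.

```python
from typing import List, Tuple

# One filter is laid out as [in_channels][kernel_h][kernel_w].
Filter = List[List[List[float]]]


def l1_norm(filt: Filter) -> float:
    """Sum of absolute weights in a single filter."""
    return sum(abs(w) for channel in filt for row in channel for w in row)


def select_filters_to_keep(filters: List[Filter], prune_ratio: float) -> List[int]:
    """Indices of the filters that survive, in their original order."""
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    n_keep = max(1, len(filters) - int(len(filters) * prune_ratio))
    ranked = sorted(
        range(len(filters)), key=lambda i: l1_norm(filters[i]), reverse=True
    )
    return sorted(ranked[:n_keep])


def prune_layer(
    filters: List[Filter], next_filters: List[Filter], prune_ratio: float
) -> Tuple[List[Filter], List[Filter], List[int]]:
    """Remove low-norm filters and the next layer's matching input channels."""
    keep = select_filters_to_keep(filters, prune_ratio)
    pruned = [filters[i] for i in keep]
    # Each output filter of this layer feeds one input channel of the next.
    pruned_next = [[f[i] for i in keep] for f in next_filters]
    return pruned, pruned_next, keep


# Toy layer: 4 filters, 1 input channel, 1x1 kernels.
layer = [[[[0.1]]], [[[-2.0]]], [[[0.5]]], [[[3.0]]]]
# The next layer has 2 filters, each with 4 input channels.
next_layer = [[[[1.0]] for _ in range(4)] for _ in range(2)]

pruned, pruned_next, kept = prune_layer(layer, next_layer, prune_ratio=0.5)
print(kept)  # indices of the two filters with the largest L1 norm
```

Removing whole filters, rather than zeroing individual weights, shrinks the tensors themselves, so the saving carries over to standard dense inference without sparse kernels. In a U-Net, the skip connections would also require the decoder's concatenated input channels to be pruned consistently, which this sketch does not handle.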
Taxonomy
Topics: AI in cancer detection
Methods: Sparse Evolutionary Training · Pruning
