Conservative & Aggressive NaNs Accelerate U-Nets for Neuroimaging
Inés Gonzalez-Pepe, Vinuyan Sivakolunthu, Jacob Fortin, Yohan Chatelain, Tristan Glatard

TL;DR
This paper introduces Conservative & Aggressive NaNs, two methods that identify and skip redundant computations in CNNs for neuroimaging. The conservative variant yields speedups without measurable performance loss; the aggressive variant skips more computation at some cost to performance.
Contribution
The paper presents two new variants of max pooling and unpooling that leverage numerical uncertainty to accelerate CNN inference in neuroimaging applications.
Findings
Up to 1.67x inference speedup for inputs containing at least 50% NaNs.
Conservative NaNs reduce convolution operations by 30% on average.
Aggressive NaNs skip up to 69.3% of convolutions, with some impact on performance.
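The skip percentages above count convolutions that are bypassed because their inputs are mostly NaN. The following is a minimal sketch of how such a count could be obtained, not the authors' PyTorch implementation. The function name `nan_skipping_conv1d`, the `max_nan_fraction` threshold, and the choice to treat remaining NaNs as zero are all assumptions made for illustration.

```python
import math


def nan_skipping_conv1d(signal, kernel, max_nan_fraction=0.5):
    """Valid 1D cross-correlation that skips windows dominated by NaNs.

    Hypothetical rule: a window is skipped (its output is NaN) when more
    than `max_nan_fraction` of its entries are NaN. In windows that are
    kept, NaN entries are ignored, i.e. treated as zero (an assumption).
    Returns (outputs, fraction_of_windows_skipped).
    """
    k = len(kernel)
    n_windows = len(signal) - k + 1
    outputs = []
    skipped = 0
    for start in range(n_windows):
        window = signal[start:start + k]
        nan_count = sum(1 for v in window if math.isnan(v))
        if nan_count / k > max_nan_fraction:
            # No multiply-accumulate work is done for this window.
            outputs.append(math.nan)
            skipped += 1
            continue
        acc = 0.0
        for w, v in zip(kernel, window):
            if not math.isnan(v):
                acc += w * v
        outputs.append(acc)
    return outputs, skipped / n_windows


nan = math.nan
outputs, skipped_fraction = nan_skipping_conv1d(
    [nan, nan, nan, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0]
)
print(outputs)           # [nan, nan, 3.0, 6.0]
print(skipped_fraction)  # 0.5
```

In this toy input, two of the four windows are skipped, so half of the multiply-accumulate work is avoided. The reported figures would come from the same kind of bookkeeping applied to real feature maps.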
Abstract
Deep learning models for neuroimaging increasingly rely on large architectures, making efficiency a persistent concern despite advances in hardware. Through an analysis of numerical uncertainty of convolutional neural networks (CNNs), we observe that many operations are applied to values dominated by numerical noise and have negligible influence on model outputs. In some models, up to two-thirds of convolution operations appear redundant. We introduce Conservative & Aggressive NaNs, two novel variants of max pooling and unpooling that identify numerically unstable voxels and replace them with NaNs, allowing subsequent layers to skip computations on irrelevant data. Both methods are implemented within PyTorch and require no architectural changes. We evaluate these approaches on four CNN models spanning neuroimaging and image classification tasks. For inputs containing at least 50% NaNs,…
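The pooling idea in the abstract can be illustrated with a small sketch. This summary does not state the exact criteria that separate the conservative and aggressive variants, so the code below uses a stand-in rule: a window counts as numerically unstable when all of its values agree within a tolerance, as in a homogeneous image background. The name `nan_max_pool1d` and the parameters `window` and `tol` are hypothetical and are not taken from the paper or from PyTorch.

```python
import math


def nan_max_pool1d(values, window=2, tol=1e-6):
    """Non-overlapping 1D max pooling that emits NaN for flat windows.

    Stand-in instability rule (an assumption, not the paper's definition):
    a window is flagged when it holds no valid values, or when its valid
    values differ by at most `tol`, so the position of the max is arbitrary.
    """
    pooled = []
    for start in range(0, len(values) - window + 1, window):
        chunk = values[start:start + window]
        valid = [v for v in chunk if not math.isnan(v)]
        if not valid:
            # The window is already all NaN; propagate it.
            pooled.append(math.nan)
        elif len(valid) > 1 and max(valid) - min(valid) <= tol:
            # Flat window: mark it so later layers can skip it.
            pooled.append(math.nan)
        else:
            pooled.append(max(valid))
    return pooled


pooled = nan_max_pool1d([0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 2.0, 2.0])
print(pooled)  # [nan, nan, 3.0, nan]
```

A larger `tol` flags more windows and so trades accuracy for more skipped work. That is one plausible way to read the conservative versus aggressive distinction, though the paper's actual definitions may differ.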
Peer Reviews
Decision: Submitted to ICLR 2026
- The paper's primary strength is identifying a novel source of inefficiency. Instead of focusing on weight redundancy (as pruning does), it targets numerical instability from pooling as a source of wasted computation.
- The Conservative NaNs method provides a significant compute reduction (~30%) with no measurable loss in accuracy (e.g., Dice/PSNR scores) on the tested neuroimaging models.
- In the experiments, the analysis is not just theoretical (FLOPS). Th
- This is the most significant weakness. The paper does not experimentally compare its method against other established acceleration techniques such as pruning or quantization. It claims the method is orthogonal, but it provides no data showing whether a 30% NaN skip is better or worse than a model pruned by 30%. In other words, other lightweight-model techniques should be compared.
- The method's effectiveness is limited to specific data types. It works well on homogeneous data (MRIs) but provi
+ The paper targets a component of many CNN architectures. Any demonstrable improvement in the efficiency of pooling/unpooling operations would be a valuable and widely applicable contribution to the field.
+ The proposed mechanism for pooling/unpooling appears to be a new design, and the core idea may be of interest to the community.
+ Severe Lack of Scholarly Context: The paper is not properly situated within the existing scientific literature.
+ No Problem Discussion: The introduction asserts that a problem exists but fails to provide a clear, evidence-based discussion of what this inefficiency is, why it is a problem, or how it impacts current models, all of which would require citations.
+ No References in Introduction: The introduction is presented without a single citation, making its claims appear unsubstantiated and
- **Well-motivated and intuitive concept**: The core observation, that uninformative pixels in homogeneous regions (like image backgrounds) cause unstable and inefficient pooling/unpooling operations, is sound and clearly explained. The proposed solutions, "conservative" and "aggressive" NaN-based pooling, are conceptually simple and easy to understand.
- **Practical impact on efficiency**: The demonstrated improvement in computational efficiency is a significant strength, particularly for applica
- **Limited and small-scale validation**: The experimental validation on the medical imaging networks (FastSurfer and FONDUE) is a significant weakness. Using only five subjects is insufficient to draw strong conclusions about the method's general performance and robustness. While the authors ensured these subjects came from different acquisition sites, the sample size is too small to cover a diverse range of anatomical variability and pathologies. Furthermore, the evaluation is confined to stru
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
