Partial Trace-Class Bayesian Neural Networks
Arran Carter, Torben Sell

TL;DR
This paper introduces three partial trace-class Bayesian neural network (PaTraC BNN) architectures that achieve uncertainty quantification comparable to standard BNNs while using significantly fewer Bayesian parameters, which reduces computational and memory costs.
Contribution
The paper proposes novel partial trace-class BNN architectures that improve efficiency and scalability in uncertainty quantification for neural networks.
Findings
Achieve similar uncertainty quantification as standard BNNs
Reduce computational and memory requirements
Demonstrate effectiveness on real-world data
Abstract
Bayesian neural networks (BNNs) allow rigorous uncertainty quantification in deep learning, but often come at a prohibitive computational cost. We propose three different innovative architectures of partial trace-class Bayesian neural networks (PaTraC BNNs) that enable uncertainty quantification comparable to standard BNNs but use significantly fewer Bayesian parameters. These PaTraC BNNs have computational and statistical advantages over standard Bayesian neural networks in terms of speed and memory requirements. Our proposed methodology therefore facilitates reliable, robust, and scalable uncertainty quantification in neural networks. The three architectures build on trace-class neural network priors which induce an ordering of the neural network parameters, and are thus a natural choice in our framework. In a numerical simulation study, we verify the claimed benefits, and further…
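The abstract's two central ideas, a prior that orders the parameters and a split into Bayesian and non-Bayesian parameters, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the variance form sigma2 / (i * j) ** alpha is modelled on the trace-class prior of Sell and Singh (2023) and may differ from the paper's parameterisation, the rule "treat the first k output nodes as Bayesian" is a hypothetical selection rule rather than any of the three PaTraC architectures, and the function names are invented here. The sketch samples from the prior only; it does not perform posterior inference.

```python
import random


def trace_class_variances(n_out, n_in, alpha=2.0, sigma2=1.0):
    """Prior variance of weight (i, j), assumed to be sigma2 / (i * j) ** alpha.

    Indices are 1-based. With alpha > 1 the variances stay summable as the
    layer grows, so later nodes contribute less and less under the prior.
    """
    return [[sigma2 / ((i * j) ** alpha) for j in range(1, n_in + 1)]
            for i in range(1, n_out + 1)]


def bayesian_mask(n_out, n_in, k):
    """Hypothetical rule: weights into the first k output nodes are Bayesian."""
    return [[i < k for _ in range(n_in)] for i in range(n_out)]


def sample_partial_layer(point_weights, variances, mask, rng):
    """Draw zero-mean Gaussian prior samples where mask is True.

    All other weights keep their fixed (deterministic) values.
    """
    return [[rng.gauss(0.0, v ** 0.5) if m else w
             for w, v, m in zip(w_row, v_row, m_row)]
            for w_row, v_row, m_row in zip(point_weights, variances, mask)]


rng = random.Random(0)
n_out, n_in, k = 6, 4, 2

variances = trace_class_variances(n_out, n_in)
mask = bayesian_mask(n_out, n_in, k)
point_weights = [[0.1] * n_in for _ in range(n_out)]  # stand-in for trained weights
layer = sample_partial_layer(point_weights, variances, mask, rng)

n_bayesian = sum(m for row in mask for m in row)
total_variance = sum(v for row in variances for v in row)
print(f"Bayesian weights: {n_bayesian} of {n_out * n_in}")
print(f"Sum of prior variances: {total_variance:.3f}")
```

In this toy layer only 8 of the 24 weights are random, which is the sense in which a partial BNN uses fewer Bayesian parameters. Because the assumed variances decay with the node index, the nodes left deterministic are those the prior already weights least.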
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
The paper includes numerical demonstrations on a few small benchmark data sets.
The paper is largely built on the trace-class BNN prior (Sell and Singh, 2023) and contains little novelty. The addition of NN architectures to select inference nodes is incremental. Some passages are confusing or contain grammatical issues, e.g., 'allows layers to be split into Bayesian and non-Bayesian parameters.' The paper could be strengthened by including comparisons with more BNN methods.
1. Combining trace-class priors with partial Bayesianization is innovative and theoretically meaningful. 2. The trace-class prior’s natural ordering offers a principled way to select Bayesian nodes without arbitrary heuristics.
1. The baseline methods are insufficient. Comparisons are limited only to standard BNNs with a trace-class prior. It would be informative to benchmark against other partial BNNs, variational BNNs (e.g., Bayes by Backprop), ensembles, and SGMCMC methods. 2. The paper does not analyze how close the PaTraC posterior is to the full BNN posterior. A Wasserstein or KL-bound analysis would strengthen the claims. 3. The uncertainty estimation of some PaTraC results is not good enough. For example, as sh
1. **Interesting combination of ideas:** The paper creatively blends trace-class priors with partial Bayesian inference. Leveraging the inherent node ordering of the trace prior to guide parameter selection is an interesting strategy. 2. **Clear definition of architectures:** The three proposed PaTraC variants are described systematically (with figures), making it easy to distinguish their differences and intended use-cases. 3. **Empirical investigation of trade-offs:** The experiments attempt
1. **Limited novelty relative to existing pBNN literature.** Prior works on partially Bayesian or subnetwork BNNs (e.g., Daxberger et al., 2021; Izmailov et al., 2020) already propose training deterministic networks and then sampling Bayesian parameters on subsets of layers. The main difference here is using the trace-class prior’s ordering to choose which nodes to Bayesianize. There is no theoretical analysis demonstrating that this selection method yields superior posterior approximations, and
1. The paper addresses the important trade-off between reliable uncertainty quantification and scalable Bayesian inference. It introduces three novel architectures (Sep-PaTraC, Out-PaTraC, and Mix-PaTraC) that effectively combine Bayesian and non-Bayesian components to achieve efficient uncertainty modeling. 2. The authors provide comprehensive empirical validation through experiments on both synthetic data and two real-world datasets (CIFAR-10 and Abalone), demonstrating the versatility of the
1. My main concern lies in the inference design, which may undermine the key advantage of using the trace-class prior. According to the authors, the trace-class prior is intended to “introduce a natural ordering of the prior weights and thus naturally lend them to truncation.” However, during pre-training, the standard neural network is trained without any prior information, and therefore, nodes with large ordering statistics may emerge in later layers. After Bayesian inference, these nodes migh
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference · Probabilistic and Robust Engineering Design
