Enhancing Diversity in Bayesian Deep Learning via Hyperspherical Energy Minimization of CKA
David Smerkous, Qinxun Bai, Fuxin Li

TL;DR
This paper introduces a novel method combining hyperspherical energy with CKA to enhance diversity in Bayesian deep learning models, leading to better uncertainty quantification and outlier detection.
Contribution
It proposes a new optimization approach using hyperspherical energy on CKA kernels to improve diversity and training stability in Bayesian deep learning ensembles.
Findings
Significantly outperforms baselines in uncertainty quantification.
Improves diversity in Bayesian deep learning models.
Enhances outlier detection capabilities.
Abstract
Particle-based Bayesian deep learning often requires a similarity metric to compare two networks. However, naive similarity metrics lack permutation invariance and are inappropriate for comparing networks. Centered Kernel Alignment (CKA) on feature kernels has been proposed to compare deep networks but has not been used as an optimization objective in Bayesian deep learning. In this paper, we explore the use of CKA in Bayesian deep learning to generate diverse ensembles and hypernetworks that output a network posterior. Noting that CKA projects kernels onto a unit hypersphere and that directly optimizing the CKA objective leads to diminishing gradients when two networks are very similar, we propose adopting the approach of hyperspherical energy (HE) on top of CKA kernels to address this drawback and improve training stability. Additionally, by leveraging CKA-based feature kernels, we…
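To make the abstract's two ingredients concrete, here is a minimal NumPy sketch of CKA and of a hyperspherical energy computed on top of it. It is an illustration under stated assumptions, not the authors' implementation: the linear feature kernel, the geodesic (arccos) distance, the Riesz exponent `s`, the smoothing constant `eps`, and all function names are choices made for this example only.

```python
import numpy as np

def centered_unit_kernel(feats):
    """Linear feature kernel, centered and scaled to unit Frobenius norm.

    feats has shape (n_samples, n_features). After scaling, the flattened
    kernel is a point on a unit hypersphere. (Linear kernel is an assumption.)
    """
    n = feats.shape[0]
    gram = feats @ feats.T                      # (n, n) Gram matrix
    center = np.eye(n) - np.ones((n, n)) / n    # centering matrix H
    gram_c = center @ gram @ center
    return gram_c / np.linalg.norm(gram_c)      # Frobenius norm for 2-D input

def cka(feats_a, feats_b):
    """CKA as the cosine similarity between two centered, normalized kernels."""
    return float(np.sum(centered_unit_kernel(feats_a) * centered_unit_kernel(feats_b)))

def hyperspherical_energy(feature_list, s=2.0, eps=1e-3):
    """Sum of inverse geodesic distances (to the power s) over network pairs.

    s and eps are hypothetical defaults chosen for this sketch.
    """
    points = [centered_unit_kernel(f).ravel() for f in feature_list]
    energy = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            cos_sim = np.clip(points[i] @ points[j], -1.0, 1.0)
            dist = np.arccos(cos_sim)           # angle between the two kernels
            energy += 1.0 / (dist ** s + eps)   # eps avoids division by zero
    return energy
```

In this sketch, permuting the feature columns (that is, reordering hidden units) leaves the Gram matrix unchanged, so CKA is unaffected, which is the permutation invariance the abstract asks for. The energy term grows sharply as the angle between two kernels approaches zero, so near-identical networks are penalized strongly; minimizing it would push ensemble members apart on the hypersphere.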
Taxonomy
Topics: Bayesian Methods and Mixture Models · Face and Expression Recognition · Neural Networks and Applications
