The Ray Tracing Sampler: Bayesian Sampling of Neural Networks for Everyone
Peter Behroozi

TL;DR
This paper introduces a novel ray tracing-based Bayesian sampler for neural networks that offers superior resilience to stochastic gradient heating and can efficiently sample complex posterior distributions, demonstrated on large models like GPT-2.
Contribution
The author develops a new ray tracing sampler that generalizes existing methods and enables efficient Bayesian inference for large neural networks on consumer hardware.
Findings
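Orders of magnitude more resilient than Hamiltonian Monte Carlo (HMC) to heating from stochastic gradients
Samples the posterior distributions of network outputs for architectures up to the 1.5 billion-parameter GPT-2, on a single consumer-level GPU
Provides a unified framework in which traditional HMC, microcanonical HMC, Metropolis, Gibbs, and Monte Carlo integration are special cases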
Abstract
We derive a Markov Chain Monte Carlo sampler based on following ray paths in a medium where the refractive index is a function of the desired likelihood. The sampling method propagates rays at constant speed through parameter space, leading to orders of magnitude higher resilience to heating for stochastic gradients as compared to Hamiltonian Monte Carlo (HMC), as well as the ability to cross any likelihood barrier, including holes in parameter space. Using the ray tracing method, we sample the posterior distributions of neural network outputs for a variety of different architectures, up to the 1.5 billion-parameter GPT-2 (Generative Pre-trained Transformer 2) architecture, all on a single consumer-level GPU. We also show that prior samplers including traditional HMC, microcanonical HMC, Metropolis, Gibbs, and even Monte Carlo integration are special cases within…
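To make the idea of constant-speed ray propagation concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm, only an illustration of the standard geometric-optics ray equation, d/ds (n u) = ∇n, which for a unit direction u reduces to du/ds = (I − u uᵀ) ∇ ln n. The choice ln n = ln L / (d − 1), the Gaussian test likelihood, the simple Euler integrator, and the names `log_likelihood_grad`, `ray_step`, and `trace_ray` are all assumptions made for illustration. The sketch traces a single ray and omits everything a full sampler needs beyond that.

```python
import math

def log_likelihood_grad(x):
    # Hypothetical target: standard Gaussian, ln L = -|x|^2 / 2, so the gradient is -x.
    return [-xi for xi in x]

def ray_step(x, u, grad_log_l, step):
    """Advance a ray by one Euler step of arc length `step` (requires len(x) >= 2)."""
    d = len(x)
    # Assumed refractive index: ln n = ln L / (d - 1).
    g = [gi / (d - 1) for gi in grad_log_l(x)]
    # Bend the direction using only the part of grad ln n perpendicular to u.
    g_dot_u = sum(gi * ui for gi, ui in zip(g, u))
    u_new = [ui + step * (gi - g_dot_u * ui) for ui, gi in zip(u, g)]
    # Renormalise so the direction stays a unit vector (constant speed).
    norm = math.sqrt(sum(ui * ui for ui in u_new))
    u_new = [ui / norm for ui in u_new]
    x_new = [xi + step * ui for xi, ui in zip(x, u_new)]
    return x_new, u_new

def trace_ray(x0, u0, grad_log_l, step=0.01, n_steps=1000):
    """Return the list of positions visited and the final unit direction."""
    x, u = list(x0), list(u0)
    path = [x]
    for _ in range(n_steps):
        x, u = ray_step(x, u, grad_log_l, step)
        path.append(x)
    return path, u

path, u_final = trace_ray([3.0, 0.0, 0.0], [0.0, 1.0, 0.0], log_likelihood_grad)
```

Because the direction is renormalised at every step, the distance covered per step is always `step`, however large or noisy the gradient is. A gradient can bend the ray but cannot speed it up, which is one informal way to read the abstract's claim about resilience to heating.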
