Resource-Efficient and Robust Inference of Deep and Bayesian Neural Networks on Embedded and Analog Computing Platforms

Bernhard Klein

arXiv:2510.24951·cs.LG·October 30, 2025


TL;DR

This paper makes neural network inference more resource-efficient and robust on embedded and analog hardware by combining model compression, approximate Bayesian methods, hardware-aware optimization, and novel photonic computing paradigms.

Contribution

The work introduces Galen for layer-specific model compression, models device noise to make analog accelerators more robust, develops efficient Bayesian inference methods, and proposes probabilistic photonic computing for energy-efficient inference.

Findings

01. Galen achieves effective layer-specific compression guided by sensitivity analysis.

02. Modeling device imperfections improves robustness of analog accelerators.

03. Photonic computing enables fast, energy-efficient probabilistic inference.
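To make the second finding concrete, the following is a minimal sketch of how device imperfections might be simulated in software. It is not the paper's noise model: the multiplicative Gaussian perturbation of each weight, the function names `noisy_linear` and `output_spread`, and the example weights are all assumptions chosen for illustration. The sketch evaluates one linear layer many times under random weight perturbations and reports how far the outputs spread.

```python
import random


def noisy_linear(weights, inputs, noise_std, rng):
    """Compute y = W x with every weight perturbed by multiplicative Gaussian noise."""
    outputs = []
    for row in weights:
        acc = 0.0
        for w, x in zip(row, inputs):
            # Assumed device model (hypothetical): w_eff = w * (1 + eps),
            # with eps drawn from a zero-mean Gaussian of std noise_std.
            acc += w * (1.0 + rng.gauss(0.0, noise_std)) * x
        outputs.append(acc)
    return outputs


def output_spread(weights, inputs, noise_std, trials=1000, seed=0):
    """Estimate the per-output mean and standard deviation over repeated noisy runs."""
    rng = random.Random(seed)
    samples = [noisy_linear(weights, inputs, noise_std, rng) for _ in range(trials)]
    n_out = len(weights)
    means = [sum(s[i] for s in samples) / trials for i in range(n_out)]
    stds = [
        (sum((s[i] - means[i]) ** 2 for s in samples) / trials) ** 0.5
        for i in range(n_out)
    ]
    return means, stds


# Illustrative 2x2 layer and input (values are arbitrary).
W = [[0.5, -1.0], [2.0, 0.25]]
x = [1.0, 2.0]

clean_means, clean_stds = output_spread(W, x, noise_std=0.0)
noisy_means, noisy_stds = output_spread(W, x, noise_std=0.05)
print("noise-free:", clean_means, clean_stds)
print("5% noise:  ", noisy_means, noisy_stds)
```

Under these assumptions, the same perturbation could be injected during training so that the learned weights tolerate it at inference time; whether the paper takes that route is not stated in this summary.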

Abstract

While modern machine learning has transformed numerous application domains, its growing computational demands increasingly constrain scalability and efficiency, particularly on embedded and resource-limited platforms. In practice, neural networks must not only operate efficiently but also provide reliable predictions under distributional shifts or unseen data. Bayesian neural networks offer a principled framework for quantifying uncertainty, yet their computational overhead further compounds these challenges. This work advances resource-efficient and robust inference for both conventional and Bayesian neural networks through the joint pursuit of algorithmic and hardware efficiency. The former reduces computation through model compression and approximate Bayesian inference, while the latter optimizes deployment on digital accelerators and explores analog hardware, bridging algorithmic…
