SNAP-UQ: Self-supervised Next-Activation Prediction for Single-Pass Uncertainty in TinyML
Ismail Lamaakal, Chaymae Yahyati, Khalid El Makkaoui, Ibrahim Ouahbi, Yassine Maleh

TL;DR
SNAP-UQ introduces a practical, single-pass, label-free uncertainty estimation method for TinyML that enhances failure detection and robustness without significant resource overhead.
Contribution
It proposes a novel depth-wise next-activation prediction approach for uncertainty estimation suitable for resource-constrained TinyML devices.
Findings
Reduces flash and latency by 40-60% compared to baselines.
Improves failure detection accuracy with higher AUPRC and AUROC scores.
Maintains strong uncertainty estimation in single forward passes.
Abstract
Reliable uncertainty estimation is a key missing piece for on-device monitoring in TinyML: microcontrollers must detect failures, distribution shift, or accuracy drops under strict flash/latency budgets, yet common uncertainty approaches (deep ensembles, MC dropout, early exits, temporal buffering) typically require multiple passes, extra branches, or state that is impractical on milliwatt hardware. This paper proposes a novel and practical method, SNAP-UQ, for single-pass, label-free uncertainty estimation based on depth-wise next-activation prediction. SNAP-UQ taps a small set of backbone layers and uses tiny int8 heads to predict the mean and scale of the next activation from a low-rank projection of the previous one; the resulting standardized prediction error forms a depth-wise surprisal signal that is aggregated and mapped through a lightweight monotone calibrator into an…
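The pipeline described in the abstract can be sketched in a few lines of NumPy. This is a minimal float illustration under stated assumptions, not the authors' implementation. The heads are untrained random matrices, not int8. The surprisal is taken as the mean squared standardized error. The aggregation is a weighted average, and the calibrator is a logistic map with a non-negative slope. All names (`make_head`, `predict`, `tap_surprisal`, `aggregate`, `calibrate`) and parameter values are hypothetical.

```python
import numpy as np


def make_head(rng, d_prev, d_next, rank):
    """Random stand-in parameters for one tap (hypothetical, untrained)."""
    return {
        "P": 0.1 * rng.normal(size=(rank, d_prev)),     # low-rank projector
        "W_mu": rng.normal(size=(d_next, rank)),        # mean head
        "b_mu": np.zeros(d_next),
        "W_ls": 0.1 * rng.normal(size=(d_next, rank)),  # log-scale head
        "b_ls": np.zeros(d_next),
    }


def predict(head, a_prev):
    """Predict mean and scale of the next activation from the previous one."""
    z = head["P"] @ a_prev                           # project to `rank` dims
    mu = head["W_mu"] @ z + head["b_mu"]
    sigma = np.exp(head["W_ls"] @ z + head["b_ls"])  # exp keeps scale positive
    return mu, sigma


def tap_surprisal(head, a_prev, a_next):
    """Mean squared standardized prediction error at one tap (assumed form)."""
    mu, sigma = predict(head, a_prev)
    return float(np.mean(((a_next - mu) / sigma) ** 2))


def aggregate(surprisals, weights):
    """Weighted average of per-tap surprisals (assumed aggregation rule)."""
    s = np.asarray(surprisals, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.dot(w, s) / np.sum(w))


def calibrate(score, beta0=-2.0, beta1=1.0):
    """Monotone logistic map from aggregated surprisal to (0, 1)."""
    if beta1 < 0:
        raise ValueError("beta1 must be non-negative to stay monotone")
    return float(1.0 / (1.0 + np.exp(-(beta0 + beta1 * score))))


# Toy usage: one tap, one "typical" and one "surprising" next activation.
rng = np.random.default_rng(0)
head = make_head(rng, d_prev=16, d_next=8, rank=4)
a_prev = rng.normal(size=16)
mu, sigma = predict(head, a_prev)

s_typ = tap_surprisal(head, a_prev, mu + 0.5 * sigma)    # close to prediction
s_shift = tap_surprisal(head, a_prev, mu + 3.0 * sigma)  # far from prediction

u_typ = calibrate(aggregate([s_typ], [1.0]))
u_shift = calibrate(aggregate([s_shift], [1.0]))
```

In this toy setup the activation that lies three predicted standard deviations from the mean gets a higher uncertainty score than the one half a standard deviation away. Everything is computed in a single pass over the tapped activations, with no labels and no stored state.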
Peer Reviews
Decision·ICLR 2026 Poster
- The concept of next-activation applied in a layer-wise manner is interesting and demonstrates strong empirical performance.
- The proposed lightweight architecture achieves impressive results despite its extremely compact size (on the order of kilobytes).
- The effectiveness of the approach is validated across multiple tasks using practical evaluation metrics and reasonable baselines, yielding consistently encouraging results.
- Despite its potential significance, the paper’s writing quality is problematic. It appears to have been heavily compressed (possibly with assistance from LLMs), resulting in missing definitions for key terms and notations (e.g., LUT, BN, ID$\surd$-ID$\times$, ID$\surd$-OOD, $1\times 1$ projector, ...). Some paragraphs even consist of a single sentence, making the paper difficult to follow.
- Although a theoretical analysis is included, it lacks sufficient discussion of its implications. The autho
- The paper is mostly well written and clear.
- Uncertainty in TinyML is an important open problem. There are not many papers in this area, so a new method and a new state of the art are welcome.
- I believe the idea of predicting layer activations to produce uncertainty makes sense. It connects with previous ideas in the field such as depth uncertainty and early-exit ensembles. The proposed method is specifically designed for TinyML applications, which I believe is novel and significant.
- The
- Some parts of the paper are hard to follow because of the heavy mathematical notation, mostly in Sec 2.2 and specifically at Eq 10, where the notation breaks. Overall, I would recommend that the authors minimize the necessary notation, repeat the definitions of key notation throughout the paper, and explain concepts together with the math. This would make the paper much easier to understand and to implement.
- The paper claims results on TinyImageNet but the results are nowhere in the
The main strength is the shift from output-layer confidence to violations of conditional layer-to-layer dynamics as the basis for uncertainty, captured in a single forward pass without labels, with only tens of kilobytes of added memory and about $2\%$ extra MACs, making it suitable for MCUs. Light theoretical support, namely the equivalence to depth-wise negative log-likelihood and conditional Mahalanobis energy and scale invariance under batch-normalization-like rescaling, grounds the intuition
1. **Tap placement feels ad hoc.** Two taps (mid / pre-logits) seem to work, but moving to a new backbone still feels like guesswork. A tiny, illustrative check would help: a quick cross-layer correlation sketch to show how signal fades with distance, plus a short “one-tap-removed” table that makes redundancy visible.
2. **Where the latency gains come from is not entirely clear.** While fairness is addressed, the source of the speedup is less so. Could you add a single stacked-bar breakdown: *ba
Taxonomy
Topics: Machine Learning and Data Classification · Parallel Computing and Optimization Techniques · Advanced Neural Network Applications
