Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML

Yassien Shaalan

arXiv:2603.24916 · cs.LG · March 27, 2026

TL;DR

HYPER-TINYPW is a generative compression method for the 1x1 pointwise mixers of small neural networks deployed on microcontrollers. It reports a 6.31x reduction in model size at similar accuracy, with steady-state latency and energy matching INT8 baselines, across biosignal and speech tasks.

Contribution

The paper proposes a compression-as-generation approach: most stored pointwise weights are replaced by weights that a shared micro-MLP generates from small per-layer codes at load time, so the deployed model still runs on standard integer operators.

Findings

01. Reduces model size by 6.31x at similar accuracy.

02. Maintains at least 95% of large-model macro-F1 on ECG benchmarks.

03. Achieves 96.2% accuracy on Speech Commands with minimal memory.

Abstract

Deploying neural networks on microcontrollers is constrained by kilobytes of flash and SRAM, where 1x1 pointwise (PW) mixers often dominate memory even after INT8 quantization across vision, audio, and wearable sensing. We present HYPER-TINYPW, a compression-as-generation approach that replaces most stored PW weights with generated weights: a shared micro-MLP synthesizes PW kernels once at load time from tiny per-layer codes, caches them, and executes them with standard integer operators. This preserves commodity MCU runtimes and adds only a one-off synthesis cost; steady-state latency and energy match INT8 separable CNN baselines. Enforcing a shared latent basis across layers removes cross-layer redundancy, while keeping PW1 in INT8 stabilizes early, morphology-sensitive mixing. We contribute (i) TinyML-faithful packed-byte accounting covering generator, heads/factorization, codes,…
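The load-time flow described in the abstract (generate kernels from per-layer codes, cache them, then run ordinary integer operators) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the two-layer generator, the function names, the symmetric per-tensor quantizer, and every dimension below are assumptions chosen for readability. In particular, the sketch lets the generator emit a full flat kernel directly and gives every layer the same shape, whereas the abstract mentions heads/factorization and a shared latent basis whose details are not given here. The parameter counts at the end are toy numbers, not the paper's packed-byte accounting.

```python
import random

def matvec(w, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def make_generator(code_dim, hidden_dim, out_dim, seed=0):
    """Hypothetical shared micro-MLP: code -> ReLU hidden -> flat PW kernel."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.5) for _ in range(code_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0.0, 0.5) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2

def generate_kernel(generator, code):
    """Synthesize one float kernel (flattened c_out x c_in) from a layer code."""
    w1, w2 = generator
    hidden = [max(0.0, h) for h in matvec(w1, code)]
    return matvec(w2, hidden)

def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization (an assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def synthesize_at_load(generator, codes):
    """One-off step at load time: build and cache an INT8 kernel per layer."""
    return [quantize_int8(generate_kernel(generator, c)) for c in codes]

def pointwise_int8(kernel_q, c_in, c_out, x_q):
    """1x1 mixing at one position: integer multiply-accumulate only."""
    return [sum(kernel_q[o * c_in + i] * x_q[i] for i in range(c_in))
            for o in range(c_out)]

# Toy, assumed sizes: 8 generated layers of 16x16 mixers, 8-dim codes.
C_IN, C_OUT, CODE_DIM, HIDDEN_DIM, NUM_LAYERS = 16, 16, 8, 4, 8

generator = make_generator(CODE_DIM, HIDDEN_DIM, C_IN * C_OUT)
rng = random.Random(1)
codes = [[rng.gauss(0.0, 1.0) for _ in range(CODE_DIM)] for _ in range(NUM_LAYERS)]

cache = synthesize_at_load(generator, codes)   # paid once, at load time

# Steady state: inference reads only the cached INT8 kernels.
x_q = [rng.randint(-127, 127) for _ in range(C_IN)]
kernel_q, scale = cache[0]
acc = pointwise_int8(kernel_q, C_IN, C_OUT, x_q)

# Toy parameter counts: generator plus codes versus storing every kernel.
generator_params = CODE_DIM * HIDDEN_DIM + HIDDEN_DIM * C_IN * C_OUT
stored_params = generator_params + NUM_LAYERS * CODE_DIM
dense_params = NUM_LAYERS * C_IN * C_OUT
print(stored_params, dense_params)
```

After `synthesize_at_load` returns, the generator is no longer on the inference path, which is consistent with the abstract's claim that only a one-off synthesis cost is added. In this toy setup the saving appears only because one generator is amortized over several layers; with a single layer the generator alone would outweigh the kernel it replaces.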

Taxonomy

Topics: Embedded Systems Design Techniques · Parallel Computing and Optimization Techniques · Low-power high-performance VLSI design