T3C: Test-Time Tensor Compression with Consistency Guarantees
Ismail Lamaakal, Chaymae Yahyati, Yassine Maleh, Khalid El Makkaoui, Ibrahim Ouahbi

TL;DR
T3C is a tensor compression framework that lets model size, latency, and energy consumption be adjusted at test time, with a certificate that upper-bounds logit drift. On ImageNet-1k, it runs ResNet-50 faster and with a smaller model than PTQ-8b at matched accuracy.
Contribution
T3C combines elastic tensor factorization, mixed-precision quantization, and a consistency certificate to enable test-time budget adaptation with reliability guarantees.
Findings
Outperforms PTQ-8b in latency and size on ResNet-50 at matched accuracy (≤ 0.5% drop): 1.18 ms p50 latency and a 38 MB model, versus 1.44 ms and 88 MB.
Reaches 2.30 ms p50 latency for ViT-B/16.
Provides predictable, certificate-backed trade-offs across devices.
Abstract
We present T3C, a train-once, test-time budget-conditioned compression framework that exposes rank and precision as a controllable deployment knob. T3C combines elastic tensor factorization (maintained up to a maximal rank) with rank-tied mixed-precision quantization and a lightweight controller that maps a latency/energy/size budget token to per-layer rank/bit assignments; the policy snaps to hardware-aligned profiles and is monotone in the budget. A fast, layerwise consistency certificate, computed from spectral proxies and activation statistics, upper-bounds logit drift and regularizes training, yielding a practical reliability signal with negligible overhead. On ImageNet-1k, T3C shifts the vision Pareto frontier: for ResNet-50 at matched accuracy (≤ 0.5% drop), p50 latency is 1.18 ms with a 38 MB model, outperforming PTQ-8b (1.44 ms, 88 MB); for ViT-B/16, T3C reaches 2.30 ms p50 with…
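The abstract describes a policy that snaps a budget to hardware-aligned profiles and is monotone in the budget. The sketch below shows one simple way to get both properties; it is not the paper's learned controller. The `Profile` type, the `PROFILES` table, its cost, rank, and bit values, and the `select_profile` function are all hypothetical and chosen only for illustration, and a single rank/bit pair per profile stands in for the per-layer assignments the abstract mentions.

```python
from bisect import bisect_right
from typing import NamedTuple, Sequence


class Profile(NamedTuple):
    """One deployable configuration (all values here are made up)."""
    cost_ms: float  # assumed measured latency of this profile on the target device
    rank: int       # rank kept in the factorized layers (uniform for brevity)
    bits: int       # weight precision tied to that rank


# Hypothetical hardware-aligned profiles, sorted by increasing cost.
PROFILES = [
    Profile(cost_ms=1.0, rank=16, bits=4),
    Profile(cost_ms=1.5, rank=32, bits=6),
    Profile(cost_ms=2.5, rank=64, bits=8),
]


def select_profile(budget_ms: float,
                   profiles: Sequence[Profile] = PROFILES) -> Profile:
    """Return the most expensive profile whose cost fits the budget.

    Because profiles are sorted by cost and rank/bits grow with cost,
    a larger budget can never select a smaller rank or precision,
    so the mapping is monotone. If no profile fits, fall back to the
    cheapest one rather than failing.
    """
    costs = [p.cost_ms for p in profiles]
    idx = bisect_right(costs, budget_ms) - 1  # last profile with cost <= budget
    return profiles[max(idx, 0)]


if __name__ == "__main__":
    for budget in (0.5, 1.2, 1.5, 10.0):
        print(budget, select_profile(budget))
```

Snapping to a small, sorted table makes the budget-to-configuration mapping easy to audit: monotonicity follows from the ordering of the table alone, independent of how the profile costs were measured.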
Taxonomy
Topics: Advanced Neural Network Applications · Sparse and Compressive Sensing Techniques · Generative Adversarial Networks and Image Synthesis
