TL;DR
This paper introduces a Gibbs entropy-based proposition linking randomness and compression in deep learning, supported by a novel Dual Tomographic Compression framework that enhances training efficiency and neural architecture search.
Contribution
It presents the Gibbs randomness-compression proposition and a Dual Tomographic Compression method, advancing energy-efficient deep learning and neural pruning techniques.
Findings
High correlation between Gibbs entropy and learning performance
DTC accelerates training and supports the lottery ticket hypothesis
Random compress-train iterations perform comparably to deterministic methods
Abstract
A proposition that connects randomness and compression is put forward via Gibbs entropy over a set of measurement vectors associated with a compression process. The proposition states that a lossy compression process is equivalent to directed randomness that preserves information content. The proposition originated from the behavior observed in the newly proposed Dual Tomographic Compression (DTC) compress-train framework. This is akin to tomographic reconstruction of layer weight matrices by building compressed sensed projections via so-called weight rays. This tomographic approach is applied to the previous and next layers in a dual fashion, which triggers neuronal-level pruning. This novel model compress-train scheme proceeds iteratively and acts as a smart neural architecture search, also called compression aware training. The experiments demonstrated the…
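The abstract does not say how the Gibbs entropy of a measurement vector is computed. As a minimal sketch only, the standard form S = -k Σ p_i ln p_i can be evaluated once a vector is mapped to probabilities. The function name `gibbs_entropy`, the constant `k`, and the normalisation by absolute values are assumptions made here for illustration, not the paper's stated procedure.

```python
import math


def gibbs_entropy(measurements, k=1.0):
    """Gibbs entropy S = -k * sum(p_i * ln p_i) of a measurement vector.

    Assumption (not from the paper): probabilities p_i are obtained by
    normalising the absolute values of the entries so they sum to 1.
    """
    weights = [abs(x) for x in measurements]
    total = sum(weights)
    if total == 0:
        raise ValueError("measurement vector has no non-zero entries")
    probs = [w / total for w in weights]
    # Entries with p_i = 0 are skipped, since p * ln(p) tends to 0 as p -> 0.
    return -k * sum(p * math.log(p) for p in probs if p > 0)


# A flat vector has maximal entropy; a one-hot vector has none.
uniform = gibbs_entropy([1.0, 1.0, 1.0, 1.0])
peaked = gibbs_entropy([1.0, 0.0, 0.0, 0.0])
```

Under this assumed normalisation, a vector whose entries are spread evenly scores higher than one concentrated in a single entry, which is the sense in which the entropy could serve as a measure of randomness over a set of measurement vectors.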
Taxonomy
Methods: Sparse Evolutionary Training
