TT-Edge: A Hardware-Software Co-Design for Energy-Efficient Tensor-Train Decomposition on Edge AI
Hyunseok Kwak, Kyeongwon Lee, Kyeongpil Min, Chaebin Jung, Woojoo Lee

TL;DR
TT-Edge is a hardware-software co-designed framework that accelerates tensor-train decomposition (TTD) on edge devices, reducing the energy consumption and latency of on-device model compression in resource-constrained environments.
Contribution
The paper introduces TT-Edge, a hardware-software co-designed system that offloads the compute-intensive SVD steps of tensor-train decomposition to a dedicated engine, improving the efficiency of the decomposition on edge AI processors.
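To make the SVD workload concrete, here is a minimal sketch of the textbook TT-SVD procedure in NumPy: the tensor is unfolded into a matrix, factored with a truncated SVD, and the remainder is carried to the next mode. The names `tt_svd`, `tt_reconstruct`, and `max_rank` are hypothetical and chosen for illustration; this is a host-side reference under those assumptions, not the TT-Edge hardware path or the authors' implementation.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Split `tensor` into tensor-train cores using one truncated SVD per mode.

    `max_rank` is an illustrative cap on every TT rank (an assumption of this sketch).
    """
    dims = tensor.shape
    cores = []
    rank_prev = 1
    unfolding = tensor.reshape(rank_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        # This SVD is the step that repeats once per mode.
        u, s, vt = np.linalg.svd(unfolding, full_matrices=False)
        rank = min(max_rank, len(s))
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        # Carry diag(s) @ vt forward and fold in the next mode.
        remainder = s[:rank, None] * vt[:rank]
        rank_prev = rank
        unfolding = remainder.reshape(rank_prev * dims[k + 1], -1)
    cores.append(unfolding.reshape(rank_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the cores back into a full tensor (for checking the sketch)."""
    result = cores[0]
    for core in cores[1:]:
        result = np.tensordot(result, core, axes=([-1], [0]))
    return result.reshape([c.shape[1] for c in cores])
```

With `max_rank` at least as large as every unfolding's rank, `tt_reconstruct(tt_svd(x, max_rank))` recovers `x` up to rounding; smaller values trade accuracy for compression.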
Findings
Achieves a 1.7x speedup over the GEMM baseline
Reduces energy consumption by 40.2%
Minimal hardware overhead with a lightweight design
Abstract
The growing demands of distributed learning on resource-constrained edge devices underscore the importance of efficient on-device model compression. Tensor-Train Decomposition (TTD) offers high compression ratios with minimal accuracy loss, yet repeated singular value decompositions (SVDs) and matrix multiplications can impose significant latency and energy costs on low-power processors. In this work, we present TT-Edge, a hardware-software co-designed framework aimed at overcoming these challenges. By splitting SVD into two phases, bidiagonalization and diagonalization, TT-Edge offloads the most compute-intensive tasks to a specialized TTD Engine. This engine integrates tightly with an existing GEMM accelerator, thereby curtailing the frequent matrix-vector transfers that often undermine system performance and energy efficiency. Implemented on a RISC-V-based edge AI processor, TT-Edge…
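The two-phase split of SVD mentioned in the abstract can be sketched with the standard Householder (Golub-Kahan) bidiagonalization. The code below is an illustrative NumPy sketch, not the TTD Engine's algorithm: `householder_vector` and `bidiagonalize` are hypothetical names, the input is assumed to have at least as many rows as columns, and the second phase is left to `np.linalg.svd` as a stand-in for an iterative diagonalization of the bidiagonal matrix.

```python
import numpy as np

def householder_vector(x):
    """Unit vector v such that (I - 2 v v^T) x is a multiple of e_1."""
    v = np.array(x, dtype=float)
    norm_x = np.linalg.norm(v)
    if norm_x == 0.0:
        return v  # zero input: the reflection reduces to the identity
    v[0] += np.copysign(norm_x, v[0])
    return v / np.linalg.norm(v)

def bidiagonalize(a):
    """Phase 1: reduce an m x n matrix (m >= n assumed) to upper-bidiagonal form."""
    b = np.array(a, dtype=float)
    m, n = b.shape
    for k in range(n):
        # Left reflection: zero column k below the diagonal.
        v = householder_vector(b[k:, k])
        b[k:, k:] -= 2.0 * np.outer(v, v @ b[k:, k:])
        if k < n - 2:
            # Right reflection: zero row k beyond the superdiagonal.
            w = householder_vector(b[k, k + 1:])
            b[k:, k + 1:] -= 2.0 * np.outer(b[k:, k + 1:] @ w, w)
    return b

def singular_values_two_phase(a):
    """Phase 2 stand-in: a library SVD applied to the bidiagonal matrix."""
    return np.linalg.svd(bidiagonalize(a), compute_uv=False)
```

Because each reflection is orthogonal, the bidiagonal matrix has the same singular values as the input. Phase 1 consists of matrix-vector products and rank-1 updates, the kind of operations the abstract describes as generating frequent matrix-vector transfers.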
Taxonomy
Topics: Tensor decomposition and applications · Low-power high-performance VLSI design · Parallel Computing and Optimization Techniques
