PowerTrain: Fast, Generalizable Time and Power Prediction Models to Optimize DNN Training on Accelerated Edges

Prashanthi S.K.; Saisamarth Taluri; Beautlin S; Lakshya Karwa; Yogesh Simmhan

arXiv:2407.13944·cs.DC·July 22, 2024

TL;DR

PowerTrain is a transfer-learning framework that accurately predicts DNN training time and power consumption on edge devices, enabling efficient power-performance trade-off optimization with minimal profiling.

Contribution

It introduces a transfer-learning approach that requires only minimal additional profiling to adapt power and time prediction models to new workloads and devices.

Findings

01

Achieves less than 6% MAPE for power prediction and less than 15% for time on new workloads.

02

Maintains prediction errors below 14.5% on different Jetson devices.

03

Outperforms baseline methods by over 10% in prediction accuracy and up to 45% in optimization efficiency.
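The error figures above are reported as MAPE (mean absolute percentage error). As a minimal sketch of how that metric is computed, the snippet below uses a hypothetical helper `mape` and made-up power readings; none of the numbers come from the paper.

```python
def mape(actual, predicted):
    """Return the mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of |actual - predicted| / |actual| over all samples.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical observed vs. predicted power draw (watts) for three power modes.
observed_w = [20.0, 40.0, 50.0]
predicted_w = [21.0, 38.0, 50.0]

power_mape = mape(observed_w, predicted_w)
print(f"Power MAPE: {power_mape:.2f}%")
```

Because MAPE normalizes each error by the observed value, it allows power (watts) and time (seconds) predictions to be compared on the same percentage scale.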

Abstract

Accelerated edge devices, like Nvidia's Jetson with 1000+ CUDA cores, are increasingly used for DNN training and federated learning, rather than just for inferencing workloads. A unique feature of these compact devices is their fine-grained control over CPU, GPU, memory frequencies, and active CPU cores, which can limit their power envelope in a constrained setting while throttling the compute performance. Given this vast 10k+ parameter space, selecting a power mode for dynamically arriving training workloads to exploit power-performance trade-offs requires costly profiling for each new workload, or is done ad hoc. We propose PowerTrain, a transfer-learning approach to accurately predict the power and time consumed when training a given DNN workload (model + dataset) using any specified power mode (CPU/GPU/memory frequencies, core-count). It requires a one-time offline…
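To illustrate the selection problem the abstract describes, here is a minimal sketch that enumerates power modes as combinations of CPU frequency, GPU frequency, memory frequency, and core count, then picks the fastest mode whose predicted power fits a budget. Everything in it is an assumption for illustration: the knob values are a heavily reduced, hypothetical set, and `toy_predict` is a made-up stand-in for a trained predictor, not PowerTrain's model.

```python
from itertools import product

# Hypothetical, heavily reduced knob values; real devices expose far more.
CPU_MHZ = [500, 1000, 2000]
GPU_MHZ = [300, 600, 1200]
MEM_MHZ = [800, 1600, 3200]
CORES = [2, 4, 8]


def toy_predict(mode):
    """Made-up stand-in for a trained predictor: returns (time_s, power_w)."""
    cpu, gpu, mem, cores = mode
    time_s = 1e5 / gpu + 1e5 / (cpu * cores) + 1e4 / mem
    power_w = 5.0 + 0.01 * gpu + 0.001 * cpu * cores + 0.002 * mem
    return time_s, power_w


def fastest_within_budget(modes, predict, budget_w):
    """Return (mode, time_s, power_w) with the lowest predicted time
    among modes whose predicted power is within budget_w, or None."""
    best = None
    for mode in modes:
        time_s, power_w = predict(mode)
        if power_w <= budget_w and (best is None or time_s < best[1]):
            best = (mode, time_s, power_w)
    return best


# A power mode is one (cpu, gpu, mem, cores) combination.
modes = list(product(CPU_MHZ, GPU_MHZ, MEM_MHZ, CORES))
choice = fastest_within_budget(modes, toy_predict, budget_w=20.0)
print(len(modes), "modes; chosen:", choice)
```

With only three values per knob the space already has 81 modes, which suggests why exhaustive profiling of the full space is costly and why a cheap predictor is attractive for this search.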
