Efficient On-device Training via Gradient Filtering

Yuedong Yang; Guihong Li; Radu Marculescu

arXiv:2301.00330 · cs.CV · March 29, 2023

TL;DR

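This paper introduces a gradient filtering method that enables on-device CNN training by reducing the computation and memory that back-propagation requires. Experiments cover image classification and semantic segmentation with models such as MobileNet, DeepLabV3, and UPerNet on devices such as Raspberry Pi and Jetson Nano.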

Contribution

The paper proposes a gradient filtering technique that gives the gradient map a structure with fewer unique elements. This structure lowers the computational complexity and memory consumption of back-propagation, making on-device CNN training practical.

Findings

01 Achieves up to 19× speedup on ImageNet classification

02 Provides 77.1% memory savings with minimal accuracy loss

03 Over 20× speedup and 90% energy savings on NVIDIA Jetson Nano

Abstract

Despite its importance for federated learning, continuous learning and many other applications, on-device training remains an open problem for EdgeAI. The problem stems from the large number of operations (e.g., floating point multiplications and additions) and memory consumption required during training by the back-propagation algorithm. Consequently, in this paper, we propose a new gradient filtering approach which enables on-device CNN model training. More precisely, our approach creates a special structure with fewer unique elements in the gradient map, thus significantly reducing the computational complexity and memory consumption of back propagation during training. Extensive experiments on image classification and semantic segmentation with multiple CNN models (e.g., MobileNet, DeepLabV3, UPerNet) and devices (e.g., Raspberry Pi and Jetson Nano) demonstrate the effectiveness and…
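The idea of a gradient map with "fewer unique elements" can be illustrated with a small sketch. The abstract does not spell out how the structure is built, so the code below assumes one simple construction: each r × r patch of a 2-D gradient map is replaced by its mean. The names `filter_gradient`, `count_unique`, and the patch size `r` are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
def filter_gradient(grad, r):
    """Replace each r x r patch of a 2-D gradient map with the patch mean.

    Assumption for illustration: patch averaging is one way to obtain
    a gradient map with fewer unique elements.
    """
    h, w = len(grad), len(grad[0])
    out = [[0.0] * w for _ in range(h)]
    for i0 in range(0, h, r):
        for j0 in range(0, w, r):
            rows = range(i0, min(i0 + r, h))
            cols = range(j0, min(j0 + r, w))
            total = sum(grad[i][j] for i in rows for j in cols)
            mean = total / (len(rows) * len(cols))
            # Every position in the patch shares the same value.
            for i in rows:
                for j in cols:
                    out[i][j] = mean
    return out


def count_unique(grad):
    """Number of distinct values in a 2-D gradient map."""
    return len({v for row in grad for v in row})


# Toy 4 x 4 gradient map with 16 distinct values.
grad = [[float(4 * i + j) for j in range(4)] for i in range(4)]
filtered = filter_gradient(grad, 2)

print(count_unique(grad), count_unique(filtered))
```

With a patch size of 2, the 4 × 4 map is reduced to four distinct values, one per patch, while the sum over the map is unchanged. Any later step that consumes the gradient map then only needs one value per patch, which is where savings in arithmetic and storage could come from under this assumption.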

Code & Models

Repositories

sldgroup/gradientfilter-cvpr23 (PyTorch, official)

Taxonomy

Topics: IoT and Edge/Fog Computing · Advanced Memory and Neural Computing · Brain Tumor Detection and Classification