Training Deep Neural Networks with Constrained Learning Parameters

Prasanna Date; Christopher D. Carothers; John E. Mitchell; James A. Hendler; Malik Magdon-Ismail

arXiv:2009.00540·cs.LG·November 16, 2021

TL;DR

This paper introduces CoNNTrA, an algorithm for training deep neural networks whose parameters are restricted to a finite set of discrete values. The resulting models are memory-efficient and suited to edge computing, with error rates comparable to those obtained with backpropagation.

Contribution

The paper proposes CoNNTrA, a coordinate gradient descent-based training method for discrete-parameter neural networks, with theoretical analysis and empirical validation on multiple datasets.
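To make the idea of coordinate-wise training over a finite value set concrete, here is a minimal sketch. It is not the authors' CoNNTrA algorithm: it is a plain greedy coordinate search on a tiny linear model. The function names, the value set `(-1, 0, 1)`, the squared-error loss, and the sweep count are all assumptions chosen for illustration.

```python
import random


def loss(weights, data):
    """Mean squared error of a linear model y ~ w . x (illustrative loss)."""
    total = 0.0
    for x, y in data:
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(data)


def discrete_coordinate_descent(data, n_weights, values=(-1, 0, 1),
                                sweeps=10, seed=0):
    """Greedy coordinate search: every weight stays inside `values`.

    Hypothetical helper, not the paper's method. Each sweep visits one
    weight at a time and keeps the candidate value that lowers the loss.
    """
    rng = random.Random(seed)
    weights = [rng.choice(values) for _ in range(n_weights)]
    best = loss(weights, data)
    for _ in range(sweeps):
        improved = False
        for i in range(n_weights):
            keep = weights[i]
            for v in values:
                if v == keep:
                    continue
                weights[i] = v
                trial = loss(weights, data)
                if trial < best:
                    best, keep, improved = trial, v, True
            weights[i] = keep  # commit the best value found for this coordinate
        if not improved:
            break  # no single-coordinate change helps any more
    return weights, best


if __name__ == "__main__":
    # Toy data where each input activates exactly one weight.
    toy = [((1, 0, 0), 1), ((0, 1, 0), -1), ((0, 0, 1), 0)]
    print(discrete_coordinate_descent(toy, n_weights=3))
```

Because the search only ever accepts a change that lowers the loss, the loss never increases from one sweep to the next. On non-separable problems a greedy search like this can stop at a local minimum.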

Findings

01. Models trained with CoNNTrA use 32x less memory than traditional methods.

02. CoNNTrA-trained models achieve error rates comparable to those of models trained with backpropagation.

03. CoNNTrA demonstrates feasibility for low-power, memory-constrained edge devices.
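One way a 32x memory reduction can arise is sketched below. The bit widths are assumptions, not figures stated on this page: the baseline is taken to be 64-bit double-precision parameters (the abstract mentions double precision), and the constrained model is taken to use 2 bits per parameter, which is enough to encode a three-valued set. The parameter count is hypothetical.

```python
def parameter_memory_bytes(n_params, bits_per_param):
    """Bytes needed to store n_params values, rounded up to whole bytes."""
    return (n_params * bits_per_param + 7) // 8


n_params = 1_000_000   # hypothetical model size
baseline_bits = 64     # assumption: double-precision parameters
constrained_bits = 2   # assumption: 2 bits encode up to four discrete values

baseline = parameter_memory_bytes(n_params, baseline_bits)
constrained = parameter_memory_bytes(n_params, constrained_bits)

print(baseline, constrained, baseline / constrained)
```

Under these assumptions the ratio is simply 64 / 2, independent of the number of parameters (up to byte rounding).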

Abstract

Today's deep learning models are primarily trained on CPUs and GPUs. Although these models tend to have low error, they consume high power and utilize a large amount of memory owing to double-precision floating-point learning parameters. Beyond Moore's law, a significant portion of deep learning tasks would run on edge computing systems, which will form an indispensable part of the entire computation fabric. Consequently, training deep learning models for such systems will have to be tailored and adapted to generate models that have the following desirable characteristics: low error, low memory, and low power. We believe that deep neural networks (DNNs), where learning parameters are constrained to have a set of finite discrete values, running on neuromorphic computing systems would be instrumental for intelligent edge computing systems having these desirable characteristics. To this…
