Algorithm for Training Neural Networks on Resistive Device Arrays
Tayfun Gokmen, Wilfried Haensch

TL;DR
The paper introduces the "Tiki-Taka" algorithm, enabling neural network training on resistive device arrays without requiring symmetrical conductance switching, thus improving hardware feasibility and maintaining accuracy.
Contribution
It presents a novel training algorithm that relaxes the symmetry requirement of resistive devices, allowing effective neural network training on non-ideal hardware.
Findings
Achieves the same accuracy with non-symmetric devices as with ideal symmetric devices.
Maintains parallel operations and low implementation cost.
Enhances hardware robustness and performance for resistive crossbar arrays.
Abstract
Hardware architectures composed of resistive cross-point device arrays can provide significant power and speed benefits for deep neural network training workloads using stochastic gradient descent (SGD) and the backpropagation (BP) algorithm. The training accuracy on this imminent analog hardware, however, strongly depends on the switching characteristics of the cross-point elements. One of the key requirements is that these resistive devices must change conductance in a symmetrical fashion when subjected to positive or negative pulse stimuli. Here, we present a new training algorithm, the so-called "Tiki-Taka" algorithm, that eliminates this stringent symmetry requirement. We show that device asymmetry introduces an unintentional implicit cost term into the SGD algorithm, whereas in the "Tiki-Taka" algorithm a coupled dynamical system simultaneously minimizes the original objective function…
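The abstract's claim that asymmetry acts like an implicit cost term can be illustrated with a small sketch. The device model below is an assumption chosen for illustration, not taken from the text above: a hypothetical "soft-bounds" device whose upward step shrinks as the weight approaches +w_max and whose downward step shrinks as it approaches -w_max. The names apply_pulse, run_alternating, dw_min, and w_max are likewise hypothetical. The sketch does not implement Tiki-Taka itself; it only shows why equal numbers of up and down pulses fail to cancel on an asymmetric device.

```python
def apply_pulse(w, direction, dw_min=0.01, w_max=1.0, asymmetric=True):
    """Apply one up (+1) or down (-1) pulse to a single simulated device.

    Assumed model (hypothetical, for illustration only):
      symmetric device:  every pulse moves w by exactly dw_min
      asymmetric device: step size depends on the current weight
                         (soft bounds at +w_max and -w_max)
    """
    if not asymmetric:
        return w + direction * dw_min
    # Up pulses shrink near +w_max, down pulses shrink near -w_max.
    return w + direction * dw_min * (1.0 - direction * w / w_max)


def run_alternating(w0, n_pairs, asymmetric):
    """Apply n_pairs of (up, down) pulse pairs, i.e. zero net pulse count."""
    w = w0
    for _ in range(n_pairs):
        w = apply_pulse(w, +1, asymmetric=asymmetric)
        w = apply_pulse(w, -1, asymmetric=asymmetric)
    return w


# Same pulse sequence on both device types, starting from the same weight.
w_sym = run_alternating(0.8, 2500, asymmetric=False)
w_asym = run_alternating(0.8, 2500, asymmetric=True)

print("symmetric device: ", w_sym)   # stays close to the starting weight
print("asymmetric device:", w_asym)  # pulled toward the device's symmetry point
```

On the symmetric device the pulse pairs cancel and the weight stays where it started. On the assumed asymmetric device each pair leaves a small net step toward the point where up and down steps are equal in size, so the weight decays toward that point even though the net pulse count is zero. That drift behaves like an extra penalty pulling weights toward the symmetry point, which is one way to read the "implicit cost term" the abstract describes.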
Taxonomy
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Stochastic Gradient Descent
