Analog In-memory Training on General Non-ideal Resistive Elements: The Impact of Response Functions
Zhaoxian Wu, Quan Xiao, Tayfun Gokmen, Omobayode Fagbohungbe, Tianyi Chen

TL;DR
This paper investigates how non-ideal, asymmetric, and non-linear response functions in analog in-memory computing hardware affect gradient-based training, proposing a residual learning algorithm to ensure convergence despite these hardware imperfections.
Contribution
It introduces a theoretical framework for training on AIMC with non-ideal response functions and proposes a residual learning algorithm that guarantees convergence.
Findings
Asymmetric response functions impair Analog SGD performance.
The Residual Learning algorithm converges to a critical point.
The method extends to hardware imperfections like limited response granularity.
Abstract
As the economic and environmental costs of training and deploying large vision or language models increase dramatically, analog in-memory computing (AIMC) emerges as a promising energy-efficient solution. However, training on AIMC hardware, and especially its training dynamics, remains underexplored. In AIMC hardware, the trainable weights are represented by the conductance of resistive elements and updated using consecutive electrical pulses. Ideally, the conductance changes by a constant amount in response to each pulse; in reality, the change is scaled by asymmetric and non-linear response functions, leading to non-ideal training dynamics. This paper provides a theoretical foundation for gradient-based training on AIMC hardware with non-ideal response functions. We demonstrate that asymmetric response functions negatively impact Analog SGD by imposing an implicit penalty on the objective. To overcome…
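The effect of response functions on a pulse-based update can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the scalar quadratic objective, the linear response functions q_plus(w) = 1 - w/tau and q_minus(w) = 1 + w/tau, and all function and parameter names. The sketch only shows how scaling upward and downward updates differently can pull the weight away from the minimizer when gradients are noisy.

```python
import random

def run_analog_sgd(q_plus, q_minus, steps=20000, lr=0.01,
                   w_star=0.5, noise_std=1.0, seed=0):
    """Noisy SGD on f(w) = 0.5 * (w - w_star)^2 with response-scaled updates.

    q_plus(w) scales weight increases, q_minus(w) scales weight decreases.
    Returns the average of w over the second half of the run.
    """
    rng = random.Random(seed)
    w = 0.0
    tail_sum, tail_count = 0.0, 0
    for k in range(steps):
        grad = (w - w_star) + rng.gauss(0.0, noise_std)  # noisy gradient
        dw = -lr * grad                                  # desired change
        if dw >= 0:
            w += dw * q_plus(w)    # upward pulses scaled by q_plus
        else:
            w += dw * q_minus(w)   # downward pulses scaled by q_minus
        if k >= steps // 2:
            tail_sum += w
            tail_count += 1
    return tail_sum / tail_count

TAU = 1.0  # hypothetical saturation bound of the device

# Ideal device: every pulse changes the weight by the same amount.
ideal = run_analog_sgd(lambda w: 1.0, lambda w: 1.0)

# Assumed asymmetric linear device (clamped so responses stay non-negative).
asym = run_analog_sgd(lambda w: max(0.0, 1.0 - w / TAU),
                      lambda w: max(0.0, 1.0 + w / TAU))

print("ideal response, mean w:     ", round(ideal, 3))
print("asymmetric response, mean w:", round(asym, 3))
```

Under these assumptions the ideal device settles near the minimizer w_star, while the asymmetric device settles noticeably closer to 0, the point where the two assumed responses coincide. This is consistent with the abstract's description of an implicit penalty, though the sketch does not reproduce the paper's analysis or its Residual Learning algorithm.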
Taxonomy
Topics: Advanced Memory and Neural Computing · Neural Networks and Applications
Methods: Stochastic Gradient Descent
