Which scaling rule applies to Artificial Neural Networks
János Végh

TL;DR
This paper analyzes the scaling behavior of Artificial Neural Networks, emphasizing the impact of data transfer and communication bottlenecks, and demonstrates that Amdahl's law accurately models their performance limitations.
Contribution
It introduces a new interpretation of Amdahl's law for ANNs, highlighting the critical role of data transfer time and communication structure in their scalability.
Findings
Data transfer time significantly limits ANN performance.
Amdahl's law accurately models ANN scaling behavior.
Communication bottlenecks constrain large ANN systems.
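As a rough illustration of the scaling limit these findings refer to, the sketch below evaluates the textbook form of Amdahl's law, S(N) = 1 / ((1 - α) + α / N), where α is the parallelizable fraction of the work and N is the number of processors. It is not the paper's reinterpretation of the law. The function names and the sample value α = 0.99 are assumptions chosen for illustration; treating data transfer time as part of the non-parallelizable share (1 - α) is likewise an assumption, not a result taken from the paper.

```python
def amdahl_speedup(parallel_fraction: float, n_processors: int) -> float:
    """Textbook Amdahl speedup for a given parallelizable fraction and processor count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    # Assumption for illustration: everything that cannot be parallelized
    # (including any data transfer time) is lumped into the serial fraction.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the processor count grows without bound."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    alpha = 0.99  # hypothetical parallelizable fraction
    for n in (1, 10, 100, 1_000, 10_000):
        print(f"N={n:>6}  speedup={amdahl_speedup(alpha, n):8.2f}")
    print(f"limit as N grows: {amdahl_limit(alpha):.2f}")
```

Under these assumptions the speedup saturates near 1 / (1 - α) regardless of how many processors are added, which is the sense in which a fixed non-parallelizable share caps the performance of large systems.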
Abstract
Experience shows that cooperating and communicating computing systems, comprising segregated single processors, have severe performance limitations. In his classic "First Draft", von Neumann warned that using a "too fast processor" vitiates his simple "procedure" (but not his computing model!), and that using the classic computing paradigm to imitate neuronal operations is unsound. Amdahl added that large machines comprising many processors have an inherent disadvantage. Because ANN components communicate heavily with each other, are built from a large number of components designed and fabricated for conventional computing, and attempt to mimic biological operation using improper technological solutions, their achievable payload computing performance is conceptually modest. The type of workload that AI-based systems generate leads to an…
