Weight Sparsity Complements Activity Sparsity in Neuromorphic Language Models
Rishav Mukherji, Mark Schöne, Khaleelulla Khan Nazeer, Christian Mayr, David Kappel, Anand Subramoney

TL;DR
This paper investigates how combining activity sparsity and weight pruning in neuromorphic language models enhances computational efficiency without significantly compromising performance, demonstrating promising results on language modeling tasks.
Contribution
It provides the first detailed analysis of weight pruning effects on neuromorphic language models, showing how sparsity methods complement each other for sequence tasks.
Findings
Sparse activity and connectivity improve efficiency.
Combining sparsities maintains performance on language tasks.
Event-based models are promising for efficient sequence modeling.
Abstract
Activity and parameter sparsity are two standard methods of making neural networks computationally more efficient. Event-based architectures such as spiking neural networks (SNNs) naturally exhibit activity sparsity, and many methods exist to sparsify their connectivity by pruning weights. While the effect of weight pruning on feed-forward SNNs has been previously studied for computer vision tasks, the effects of pruning for complex sequence tasks like language modeling are less well studied since SNNs have traditionally struggled to achieve meaningful performance on these tasks. Using a recently published SNN-like architecture that works well on small-scale language modeling, we study the effects of weight pruning when combined with activity sparsity. Specifically, we study the trade-off between the multiplicative efficiency gains the combination affords and its effect on task…
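The "multiplicative" gain mentioned in the abstract can be illustrated with a small operation count. The sketch below is not the authors' code or architecture: it assumes a single matrix-vector product, uses a random mask as a stand-in for a real pruning method, and the names `count_macs` and `make_example` as well as the density values are hypothetical. In an event-driven product, a multiply-accumulate is only needed where a weight is nonzero and its input unit is active, so the expected cost is roughly the weight density times the activity density of the dense cost.

```python
import numpy as np

def count_macs(weights, activations):
    """Count the multiply-accumulates an event-driven matrix-vector product needs.

    Only nonzero weights in columns whose input unit is active contribute.
    """
    active = activations != 0
    return int(np.count_nonzero(weights[:, active]))

def make_example(n_out=512, n_in=512, weight_density=0.2,
                 activity_density=0.1, seed=0):
    """Build a randomly pruned weight matrix and a sparse activation vector.

    Assumption: random masks stand in for a real pruning method and for the
    activity sparsity an event-based model would produce.
    """
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((n_out, n_in))
    weights *= rng.random((n_out, n_in)) < weight_density
    activations = rng.standard_normal(n_in)
    activations *= rng.random(n_in) < activity_density
    return weights, activations

weights, activations = make_example()
dense_macs = weights.size                       # cost with no sparsity at all
sparse_macs = count_macs(weights, activations)  # cost using both sparsities
fraction = sparse_macs / dense_macs             # near 0.2 * 0.1 in expectation
print(f"fraction of dense MACs needed: {fraction:.4f}")
```

Skipping the inactive columns does not change the result of the product, since those columns are multiplied by zero; only the number of operations changes. Whether such savings are realized in practice depends on hardware support for sparse, event-driven computation.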
Taxonomy
Topics: Language Development and Disorders · Neurobiology of Language and Bilingualism · Ferroelectric and Negative Capacitance Devices
Methods: Pruning
