# Spatially-Coupled Neural Network Architectures

**Authors:** Arman Hasanzadeh, Nagaraj T. Janakiraman, Vamsi K. Amalladinne, Krishna R. Narayanan

arXiv: 1907.02051 · 2019-07-04

## TL;DR

This paper introduces a spatially-coupled neural network architecture that cuts the number of trainable parameters by 94% while performing on par with a conventional fully connected network trained with dropout. It allocates parameters according to feature importance instead of dropping neurons or edges at random.

## Contribution

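It proposes a structured sparse architecture, inspired by spatially-coupled sparse constructions, that allocates trainable parameters according to feature importance and reaches performance comparable to a fully connected network with far fewer parameters.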

## Key findings

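- Achieves a 94% reduction in trainable parameters.
- Performs on par with a conventional fully connected network trained with dropout.
- Allocates parameters by feature importance, unlike $\ell_1$ regularization, DropOut, and DropConnect, which ignore the underlying structure in the data.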

## Abstract

In this work, we leverage advances in sparse coding techniques to reduce the number of trainable parameters in a fully connected neural network. While most works in the literature impose $\ell_1$ regularization, DropOut or DropConnect techniques to induce sparsity, our scheme considers feature importance as a criterion to allocate the trainable parameters (resources) efficiently in the network. Even though sparsity is ensured, $\ell_1$ regularization requires training on all the resources in a deep neural network. The DropOut/DropConnect techniques reduce the number of trainable parameters in the training stage by dropping a random collection of neurons/edges in the hidden layers. However, neither of these techniques pays heed to the underlying structure in the data when dropping the neurons/edges. Moreover, these frameworks require a storage space equivalent to the number of parameters in a fully connected neural network. We address the above issues with a more structured architecture inspired by spatially-coupled sparse constructions. The proposed architecture is shown to have a performance akin to a conventional fully connected neural network with dropouts, while achieving a $94\%$ reduction in the training parameters. Extensive simulations are presented and the performance of the proposed scheme is compared against traditional neural network architectures.
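
To make the idea of a fixed, structured sparsity pattern concrete, here is a minimal Python sketch of a banded block mask for one fully connected layer. It is an illustration under assumptions, not the authors' construction: the function names (`banded_block_mask`, `count_params`, `reduction`, `masked_matvec`), the block partitioning, and the coupling `window` parameter are all hypothetical, and the feature-importance ordering described in the abstract is not modeled. The only point is that a mask fixed before training, unlike a DropOut mask resampled at random, determines in advance which weights exist at all.

```python
def banded_block_mask(n_in, n_out, n_blocks, window):
    """Build an n_out x n_in 0/1 mask with a banded block structure.

    Inputs and outputs are each split into n_blocks contiguous blocks.
    Output block b is connected only to input blocks b, b+1, ..., b+window-1
    (truncated at the last block). This is an assumed, simplified pattern.
    """
    in_block = [i * n_blocks // n_in for i in range(n_in)]
    out_block = [j * n_blocks // n_out for j in range(n_out)]
    return [
        [1 if 0 <= in_block[i] - out_block[j] < window else 0
         for i in range(n_in)]
        for j in range(n_out)
    ]


def count_params(mask):
    """Number of weights that the mask keeps (trainable parameters)."""
    return sum(sum(row) for row in mask)


def reduction(mask):
    """Fraction of weights removed relative to a dense layer of the same shape."""
    total = len(mask) * len(mask[0])
    return 1.0 - count_params(mask) / total


def masked_matvec(weights, mask, x):
    """Apply the layer: only weights where the mask is 1 contribute."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


# Toy configuration (hypothetical sizes, chosen only for illustration).
mask = banded_block_mask(n_in=16, n_out=16, n_blocks=8, window=2)
print(count_params(mask), "of", 16 * 16, "weights kept")
print("reduction:", reduction(mask))
```

In this toy configuration the mask keeps 60 of 256 weights, a reduction of roughly 77%. That number follows only from the sizes chosen here and is not meant to reproduce the paper's 94% figure. Because the pattern is fixed, an implementation could store just the kept weights, whereas DropOut and DropConnect still store every weight of the dense layer.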

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.02051/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1907.02051/full.md

## References

13 references — full list in the complete paper: https://tomesphere.com/paper/1907.02051/full.md

---
Source: https://tomesphere.com/paper/1907.02051