# Compressing deep quaternion neural networks with targeted regularization

**Authors:** Riccardo Vecchi, Simone Scardapane, Danilo Comminiello, Aurelio Uncini

arXiv: 1907.11546 · 2020-07-14

## TL;DR

This paper introduces targeted regularization methods for quaternion-valued neural networks (QVNNs) that reduce both network size and overfitting during training, making the resulting models better suited to low-power and real-time applications.

## Contribution

It proposes two quaternion-specific extensions of l1 and structured regularization that sparsify a network's connections and neurons during training, a problem the authors state has not been properly addressed in the literature.

## Key findings

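- Regularization tailored to the quaternion domain significantly outperforms classical (real-valued) regularization approaches on quaternion networks.
- The resulting quaternion networks are small, which makes them especially suitable for low-power and real-time applications.
- The targeted strategies address overfitting and sparsification in quaternion neural networks (QVNNs) together, during training.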

## Abstract

In recent years, hyper-complex deep networks (such as complex-valued and quaternion-valued neural networks) have received renewed interest in the literature. They find applications in multiple fields, ranging from image reconstruction to 3D audio processing. Similar to their real-valued counterparts, quaternion neural networks (QVNNs) require custom regularization strategies to avoid overfitting. In addition, for many real-world applications and embedded implementations, there is a need to design sufficiently compact networks, with few weights and neurons. However, the problem of regularizing and/or sparsifying QVNNs has not been properly addressed in the literature as of now. In this paper, we show how to address both problems by designing targeted regularization strategies, which are able to minimize the number of connections and neurons of the network during training. To this end, we investigate two extensions of l1 and structured regularization to the quaternion domain. In our experimental evaluation, we show that these tailored strategies significantly outperform classical (real-valued) regularization approaches, resulting in small networks especially suitable for low-power and real-time applications.
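To make the contrast between real-valued and quaternion-level sparsity concrete, the following is a minimal sketch and not the paper's exact formulation. It assumes a quaternion weight is stored as a 4-tuple of real components, and it assumes that one natural quaternion-domain analogue of l1 is the sum of quaternion norms (a group-lasso-style penalty over the four components). The function names and the tuple representation are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical representation (an assumption, not taken from the paper):
# a quaternion weight a + b*i + c*j + d*k is stored as a 4-tuple (a, b, c, d).


def componentwise_l1(weights):
    """Classical real-valued l1: every real component is penalized independently."""
    return sum(abs(x) for q in weights for x in q)


def quaternion_norm(q):
    """Euclidean norm of a single quaternion."""
    return math.sqrt(sum(x * x for x in q))


def quaternion_l1(weights):
    """Sum of quaternion norms: the four components of a weight are penalized jointly."""
    return sum(quaternion_norm(q) for q in weights)


def group_soft_threshold(q, lam):
    """Proximal step for lam * quaternion_norm(q).

    A quaternion whose norm is at most lam is set to zero as a whole;
    otherwise all four components are shrunk by the same factor.
    """
    norm = quaternion_norm(q)
    if norm <= lam:
        return (0.0, 0.0, 0.0, 0.0)
    scale = 1.0 - lam / norm
    return tuple(scale * x for x in q)


# Two weight sets with the same number of nonzero real components.
concentrated = [(1.0, 1.0, 1.0, 1.0), (0.0, 0.0, 0.0, 0.0)]
spread = [(1.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 1.0)]

if __name__ == "__main__":
    print(componentwise_l1(concentrated), componentwise_l1(spread))
    print(quaternion_l1(concentrated), quaternion_l1(spread))
    print(group_soft_threshold((0.1, 0.1, 0.1, 0.1), 0.5))
```

In this sketch the component-wise penalty cannot distinguish the two weight sets, while the norm-based penalty is lower when the nonzero components are concentrated in a single quaternion. That preference for removing whole quaternion connections, rather than scattered real components, is the kind of behavior a quaternion-level penalty is meant to encourage.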

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.11546/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1907.11546/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1907.11546/full.md

---
Source: https://tomesphere.com/paper/1907.11546