TL;DR
This paper introduces deep quaternion networks, which extend deep learning to the hyper-complex numbers, and reports improved convergence and parameter efficiency on image classification and segmentation tasks.
Contribution
It develops the theoretical foundation and architecture components for deep quaternion networks, including quaternion convolutions, weight initialization, and batch-normalization algorithms.
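As a rough illustration of the building block behind a quaternion convolution, the sketch below implements the Hamilton product and a one-dimensional, single-channel sliding-window convolution over quaternion values. It is a minimal sketch, not the paper's implementation: the function names `hamilton_product` and `quaternion_conv1d`, the tuple layout `(r, i, j, k)`, the choice to multiply the weight on the left, and the cross-correlation (no kernel flip) convention are all assumptions made for this example.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part
    )


def quaternion_conv1d(signal, kernel):
    """Valid-mode sliding window: each output is a sum of Hamilton products.

    Assumption for this sketch: the weight multiplies the input on the left,
    and the kernel is not flipped (cross-correlation convention).
    """
    outputs = []
    for t in range(len(signal) - len(kernel) + 1):
        acc = (0.0, 0.0, 0.0, 0.0)
        for s, w in enumerate(kernel):
            prod = hamilton_product(w, signal[t + s])
            acc = tuple(x + y for x, y in zip(acc, prod))
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    i = (0, 1, 0, 0)
    j = (0, 0, 1, 0)
    print(hamilton_product(i, j))  # i * j
    print(hamilton_product(j, i))  # j * i, which differs in sign
```

Because the Hamilton product is not commutative, the side on which the weight multiplies matters; an actual implementation has to fix that convention once and use it consistently.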
Findings
Quaternion networks converge faster than real-valued and complex-valued networks.
Quaternion networks use fewer parameters for similar or better performance.
Significant improvements are observed on segmentation tasks.
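One way to see where a parameter saving can come from is to count weights in a single fully connected layer. The sketch below assumes that a quaternion layer groups every four real features into one quaternion and stores one quaternion weight (four real numbers) per input-output pair; biases are ignored. The function names are hypothetical, and the numbers are plain arithmetic for illustration, not results reported in the paper.

```python
def real_dense_params(in_features, out_features):
    """Weights in a real-valued dense layer (bias ignored)."""
    return in_features * out_features


def quaternion_dense_params(in_features, out_features):
    """Real-number weights in a quaternion dense layer (bias ignored).

    Assumption: features are grouped four at a time into quaternions, and
    each quaternion-to-quaternion connection stores 4 real numbers.
    """
    if in_features % 4 or out_features % 4:
        raise ValueError("feature counts must be multiples of 4")
    return 4 * (in_features // 4) * (out_features // 4)


if __name__ == "__main__":
    n_in, n_out = 64, 128
    print(real_dense_params(n_in, n_out))        # 8192
    print(quaternion_dense_params(n_in, n_out))  # 2048
```

Under these assumptions the quaternion layer needs a quarter of the weights of a real-valued layer with the same number of real input and output features.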
Abstract
The field of deep learning has seen significant advancement in recent years. However, much of the existing work has been focused on real-valued numbers. Recent work has shown that a deep learning system using the complex numbers can be deeper for a fixed parameter budget compared to its real-valued counterpart. In this work, we explore the benefits of generalizing one step further into the hyper-complex numbers, quaternions specifically, and provide the architecture components needed to build deep quaternion networks. We develop the theoretical basis by reviewing quaternion convolutions, developing a novel quaternion weight initialization scheme, and developing novel algorithms for quaternion batch-normalization. These pieces are tested in a classification model by end-to-end training on the CIFAR-10 and CIFAR-100 data sets and a segmentation model by end-to-end training on the KITTI…
