AlgebraNets
Jordan Hoffmann, Simon Schmitt, Simon Osindero, Karen Simonyan, Erich Elsen

TL;DR
This paper explores neural networks built from a variety of algebraic structures beyond real numbers, demonstrating their potential for improved efficiency and performance on large-scale image and language tasks.
Contribution
It introduces AlgebraNets, a comprehensive study of neural networks using alternative algebras, showing their advantages over traditional real-valued networks on challenging benchmarks.
Findings
Certain algebras outperform real-valued networks in efficiency.
Multiplication in these algebras has higher compute density than real-valued multiplication.
Inducing sparsity in AlgebraNets enhances their practicality.
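The compute-density finding can be illustrated with a minimal sketch for two of the algebras named in the abstract, complex numbers and quaternions. The function names and the "multiplies per value loaded" accounting below are illustrative assumptions (a naive count with no fast-multiplication tricks), not the paper's own measurement.

```python
def complex_mul(a, b):
    """Multiply (a0 + a1*i) by (b0 + b1*i), each given as a (real, imag) tuple."""
    a0, a1 = a
    b0, b1 = b
    # 4 real multiplies for 4 real values loaded.
    return (a0 * b0 - a1 * b1, a0 * b1 + a1 * b0)


def quaternion_mul(p, q):
    """Hamilton product of two quaternions, each given as a (w, x, y, z) tuple."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    # 16 real multiplies for 8 real values loaded.
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def multiplies_per_value_loaded(dim):
    """Naive real multiplies per real value loaded for one product of two
    dim-dimensional elements (hypothetical helper; assumes dim*dim multiplies)."""
    return (dim * dim) / (2 * dim)


if __name__ == "__main__":
    for name, dim in [("real", 1), ("complex", 2), ("quaternion", 4)]:
        print(name, multiplies_per_value_loaded(dim))
```

Under this naive count, each value fetched from memory takes part in more arithmetic as the dimension of the algebra grows, which is one way to read "higher compute density".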
Abstract
Neural networks have historically been built layerwise from the set of functions in $\{f: \mathbb{R}^n \to \mathbb{R}^m\}$, i.e. with activations and weights/parameters represented by real numbers, $\mathbb{R}$. Our work considers a richer set of objects for activations and weights, and undertakes a comprehensive study of alternative algebras as number representations by studying their performance on two challenging problems: large-scale image classification using the ImageNet dataset and language modeling using the enwiki8 and WikiText-103 datasets. We denote this broader class of models as AlgebraNets. Our findings indicate that the conclusions of prior work, which explored neural networks constructed from $\mathbb{C}$ (complex numbers) and $\mathbb{H}$ (quaternions) on smaller datasets, do not always transfer to these challenging settings. However, our results demonstrate that there…
Taxonomy
Topics: Formal Methods in Verification
