Reducing Memory Requirements for the IPU using Butterfly Factorizations

S.-Kazem Shekofteh; Christian Alles; Holger Fröning

arXiv:2309.08946·cs.DC·September 19, 2023

Open Access

TL;DR

This paper explores implementing butterfly factorizations on IPUs to significantly reduce memory requirements and improve performance in machine learning tasks, demonstrating high compression ratios and speedups over GPUs.

Contribution

It shows how butterfly structures can be implemented on IPUs, reporting substantial memory reduction and performance improvements compared to GPU implementations.

Findings

01 98.5% compression ratio achieved

02 IPU benefits from 1.3x to 1.6x performance improvement

03 1.62x training time speedup on CIFAR10

Abstract

High Performance Computing (HPC) has benefited from various improvements over the last decades, especially in terms of hardware platforms that provide more processing power while keeping power consumption at a reasonable level. The Intelligence Processing Unit (IPU) is a new type of massively parallel processor, designed to speed up parallel computations with a huge number of processing cores and on-chip memory components connected by high-speed fabrics. IPUs mainly target machine learning applications; however, due to the architectural differences between GPUs and IPUs, especially the significantly smaller memory capacity of an IPU, methods for reducing model size by sparsification have to be considered. Butterfly factorizations are well-known replacements for fully-connected and convolutional layers. In this paper, we examine how butterfly structures can be implemented on an IPU and study…
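To illustrate why a butterfly factorization can replace a fully-connected layer with far fewer weights, here is a minimal pure-Python sketch. It is not the paper's IPU implementation. It assumes the standard butterfly form, in which an n x n matrix (n a power of two) is written as a product of log2(n) sparse factors with two nonzeros per row. All function names are hypothetical and chosen for this example only.

```python
import random


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def butterfly_param_count(n):
    """Nonzero weights in log2(n) butterfly factors of size n x n (2 per row)."""
    if not is_power_of_two(n):
        raise ValueError("n must be a power of two")
    levels = n.bit_length() - 1  # log2(n)
    return 2 * n * levels


def compression_ratio(n):
    """Fraction of the n*n dense weights saved by the butterfly form."""
    return 1.0 - butterfly_param_count(n) / (n * n)


def random_butterfly(n, rng):
    """One (w_self, w_pair) weight pair per row, for each of the log2(n) factors."""
    if not is_power_of_two(n):
        raise ValueError("n must be a power of two")
    levels = n.bit_length() - 1
    return [[(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n)]
            for _ in range(levels)]


def butterfly_apply(factors, x):
    """Multiply x by the product of the factors in O(n log n) operations."""
    y = list(x)
    for level, weights in enumerate(factors):
        stride = 1 << level  # row i is paired with row i XOR stride
        y = [w_self * y[i] + w_pair * y[i ^ stride]
             for i, (w_self, w_pair) in enumerate(weights)]
    return y


def to_dense(factors, n):
    """Materialize the equivalent dense matrix (for checking only)."""
    cols = [butterfly_apply(factors, [1.0 if j == k else 0.0 for j in range(n)])
            for k in range(n)]
    return [[cols[k][i] for k in range(n)] for i in range(n)]


if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(n, butterfly_param_count(n), round(compression_ratio(n), 4))
```

Under these assumptions, a 1024 x 1024 layer stores 20,480 weights instead of 1,048,576, a saving of roughly 98%. The 98.5% figure reported above depends on the layer sizes and butterfly variants used in the paper, which this sketch does not reproduce.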

Taxonomy

Topics: Algorithms and Data Compression · Error Correcting Code Techniques · Advanced Data Compression Techniques