Efficient Multiplication of Dense Matrices over GF(2)

Martin Albrecht; Gregory Bard; William Hart

arXiv:0811.1714·cs.MS·March 27, 2012


TL;DR

This paper presents optimized algorithms for dense matrix multiplication over GF(2), including implementations of Strassen-Winograd and the Method of the Four Russians, demonstrating high performance on modern CPUs.

Contribution

The paper introduces an efficient implementation of matrix multiplication algorithms over GF(2) in the M4RI library, emphasizing performance optimization and practical benchmarking.

Findings

01

High performance on AMD Opteron and Intel Core 2 Duo processors

02

Memory access and data locality are the main bottlenecks

03

Parallel bitwise operations enable fast computations over GF(2)
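As a minimal illustration of the third finding, the sketch below packs GF(2) elements into a single word and operates on all of them at once. Python integers stand in for 64-bit machine words, and the helper names (`pack_bits`, `gf2_add`, `gf2_mul`, `gf2_dot`) are hypothetical, chosen for this example; they are not part of the M4RI API.

```python
MASK64 = (1 << 64) - 1  # emulate a 64-bit machine word


def pack_bits(bits):
    """Pack up to 64 GF(2) elements into one word (bit i holds bits[i])."""
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    return word & MASK64


def unpack_bits(word, n):
    """Return the lowest n bits of word as a list of 0/1 values."""
    return [(word >> i) & 1 for i in range(n)]


def gf2_add(a, b):
    """Element-wise addition over GF(2): one XOR handles every packed element."""
    return (a ^ b) & MASK64


def gf2_mul(a, b):
    """Element-wise multiplication over GF(2): one AND handles every packed element."""
    return a & b


def gf2_dot(a, b):
    """Inner product over GF(2): parity of the bitwise AND."""
    return bin(a & b).count("1") & 1


a = pack_bits([1, 0, 1, 1])
b = pack_bits([1, 1, 0, 1])
print(unpack_bits(gf2_add(a, b), 4))  # element-wise sum
print(unpack_bits(gf2_mul(a, b), 4))  # element-wise product
print(gf2_dot(a, b))                  # inner product
```

In a compiled language the same operations map to single XOR and AND instructions on a native word, which is where the per-word parallelism comes from.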

Abstract

We describe an efficient implementation of a hierarchy of algorithms for multiplication of dense matrices over the field with two elements (GF(2)). In particular we present our implementation -- in the M4RI library -- of Strassen-Winograd matrix multiplication and the "Method of the Four Russians" multiplication (M4RM) and compare it against other available implementations. Good performance is demonstrated on AMD's Opteron and particularly good performance on Intel's Core 2 Duo. The open-source M4RI library is available stand-alone as well as part of the Sage mathematics software. In machine terms, addition in GF(2) is logical XOR and multiplication is logical AND, so a 64-bit machine word allows one to operate on 64 elements of GF(2) in parallel: at most one CPU cycle for 64 parallel additions or multiplications. As such, element-wise operations over GF(2) are relatively…
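The idea behind the "Method of the Four Russians" multiplication named in the abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the M4RI implementation: each matrix row is stored as a Python integer (bit j holds column j), the function name `m4rm` and the block-size parameter `k` are chosen for this example, and each table entry is built from an earlier entry with one XOR. The abstract does not describe how the library constructs its tables.

```python
def m4rm(A, B, l, k=4):
    """Multiply A (m x l) by B (l x n) over GF(2).

    A is a list of m integers with l significant bits each.
    B is a list of l integers, one per row of B.
    Returns the rows of C = A * B as a list of m integers.
    """
    C = [0] * len(A)
    for start in range(0, l, k):
        width = min(k, l - start)  # the last block may be narrower than k
        # table[x] is the XOR of rows B[start + t] for every set bit t of x.
        table = [0] * (1 << width)
        for x in range(1, 1 << width):
            low = x & -x                # lowest set bit of x
            t = low.bit_length() - 1    # index of that bit
            table[x] = table[x ^ low] ^ B[start + t]
        mask = (1 << width) - 1
        for i, a_row in enumerate(A):
            # The k bits of A's row select one precomputed combination.
            C[i] ^= table[(a_row >> start) & mask]
    return C


A = [0b10110, 0b00001, 0b11111]               # 3 x 5 matrix
B = [0b101, 0b011, 0b110, 0b000, 0b111]       # 5 x 3 matrix
print([format(row, "03b") for row in m4rm(A, B, l=5)])
```

The table for each block of `k` rows of B is computed once and reused for every row of A, so each row of A costs one lookup and one row XOR per block instead of up to `k` row XORs.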
