Efficient Computation of Positional Population Counts Using SIMD Instructions

Marcus D. R. Klarqvist; Wojciech Muła; Daniel Lemire

arXiv:1911.02696·cs.DS·August 19, 2021

TL;DR

This paper introduces SIMD-based algorithms for computing positional population counts. On large inputs they need up to 400 times fewer instructions and run up to 50 times faster than baseline methods.

Contribution

It presents novel SIMD algorithms for fast positional population counts, reducing instruction count and increasing speed compared to baseline methods.

Findings

1. Up to 400 times fewer instructions needed.
2. Up to 50 times faster execution for large inputs.
3. Efficient computation of positional population counts using SIMD.

Abstract

In several fields such as statistics, machine learning, and bioinformatics, categorical variables are frequently represented as one-hot encoded vectors. For example, given 8 distinct values, we map each value to a byte where only a single bit has been set. We are motivated to quickly compute statistics over such encodings. Given a stream of k-bit words, we seek to compute k distinct sums corresponding to bit values at indexes 0, 1, 2, ..., k-1. If the k-bit words are one-hot encoded then the sums correspond to a frequency histogram. This multiple-sum problem is a generalization of the population-count problem where we seek the sum of all bit values. Accordingly, we refer to the multiple-sum problem as a positional population-count. Using SIMD (Single Instruction, Multiple Data) instructions from recent Intel processors, we describe algorithms for computing the 16-bit position population…
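
To make the problem statement concrete, here is a minimal scalar sketch of a positional population count. It is not the paper's SIMD algorithm; it only shows the result those algorithms compute. The function name `positional_popcount`, its parameter `k`, and the sample data are illustrative assumptions, not taken from the paper or its repository.

```python
def positional_popcount(words, k=16):
    """Return k sums, where counts[i] is the number of words with bit i set.

    Scalar reference only (hypothetical helper, not the paper's SIMD code).
    """
    counts = [0] * k
    for w in words:
        for i in range(k):
            # Extract bit i of the word and add it to the i-th sum.
            counts[i] += (w >> i) & 1
    return counts


# One-hot example with 8 distinct values: each value maps to a byte
# in which exactly one bit is set, so the sums form a frequency histogram.
categories = [0, 3, 3, 7, 0, 3]          # assumed sample data
one_hot = [1 << c for c in categories]   # e.g. value 3 becomes 0b00001000
histogram = positional_popcount(one_hot, k=8)

# Adding up the k positional sums recovers the ordinary population count.
total_bits = sum(histogram)
```

With one-hot input, `histogram[i]` is the number of times value `i` occurs. For arbitrary words the same function returns one count per bit position, and the sum of those counts equals the ordinary population count over all words.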

Code & Models

Repositories

lemire/pospopcnt_avx512 (official)
