# Speeding-up the Verification Phase of Set Similarity Joins in the GPGPU paradigm

**Authors:** Christos Bellas, Anastasios Gounaris

arXiv: 1812.09141 · 2018-12-24

## TL;DR

This paper presents a CPU-GPU co-processing approach that moves the verification phase of exact set similarity joins to the GPU, achieving speed-ups of up to 2.6X by addressing data serialization and layout, thread management, and set comparison techniques.

## Contribution

It introduces GPU-based solutions for the verification phase of exact set similarity joins that fully overlap verification with CPU tasks, yielding speed-ups of up to 2.6X on real datasets.

## Key findings

- Achieved up to 2.6X speed-up in verification phase
- Optimized data serialization and thread management for GPU
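- Experiments on real datasets show that GPU verification fully overlaps with CPU tasks, which the authors take as evidence that the solutions have reached their maximum potential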

## Abstract

We investigate the problem of exact set similarity joins using a co-process CPU-GPU scheme. The state-of-the-art CPU solutions split the work into two main phases. First, filtering and index building take place to reduce the candidate sets to be compared as much as possible; then the pairs are compared to verify whether they should become part of the result. We investigate in-depth solutions for transferring the second, so-called verification phase, to the GPU, addressing several challenges regarding the data serialization and layout, the thread management and the techniques to compare sets of tokens. Using real datasets, we provide concrete experimental proofs that our solutions have reached their maximum potential, since they totally overlap verification with CPU tasks, and manage to yield significant speed-ups, up to 2.6X in our cases.
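
To make the verification phase concrete, the following is a minimal CPU-side sketch in Python, not the authors' GPU implementation. It assumes Jaccard similarity as the similarity function, integer token identifiers, and a flattened layout (one token array plus an offsets array) as one plausible way to serialize sets before handing them to a device. The names `serialize`, `overlap`, and `verify` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple


def serialize(sets: Sequence[Sequence[int]]) -> Tuple[List[int], List[int]]:
    """Flatten token sets into one token array plus an offsets array.

    Assumed layout: set i occupies tokens[offsets[i]:offsets[i + 1]],
    with its tokens deduplicated and sorted in ascending order.
    """
    tokens: List[int] = []
    offsets: List[int] = [0]
    for s in sets:
        tokens.extend(sorted(set(s)))
        offsets.append(len(tokens))
    return tokens, offsets


def overlap(tokens: List[int], offsets: List[int], i: int, j: int) -> int:
    """Count common tokens of sets i and j with a merge-style scan."""
    a, a_end = offsets[i], offsets[i + 1]
    b, b_end = offsets[j], offsets[j + 1]
    count = 0
    while a < a_end and b < b_end:
        if tokens[a] == tokens[b]:
            count += 1
            a += 1
            b += 1
        elif tokens[a] < tokens[b]:
            a += 1
        else:
            b += 1
    return count


def verify(
    sets: Sequence[Sequence[int]],
    candidates: Sequence[Tuple[int, int]],
    threshold: float,
) -> List[Tuple[int, int]]:
    """Keep the candidate pairs whose Jaccard similarity meets the threshold."""
    tokens, offsets = serialize(sets)
    result: List[Tuple[int, int]] = []
    for i, j in candidates:
        inter = overlap(tokens, offsets, i, j)
        size_i = offsets[i + 1] - offsets[i]
        size_j = offsets[j + 1] - offsets[j]
        union = size_i + size_j - inter
        # Two empty sets have no defined Jaccard score; this sketch skips them.
        if union > 0 and inter / union >= threshold:
            result.append((i, j))
    return result


if __name__ == "__main__":
    example_sets = [[1, 2, 3, 4], [2, 3, 4, 5], [7, 8, 9]]
    example_candidates = [(0, 1), (0, 2), (1, 2)]
    print(verify(example_sets, example_candidates, threshold=0.5))
```

In this sketch each candidate pair is verified independently of the others, which is the property that makes the phase a natural fit for parallel execution; how pairs are mapped to GPU threads is one of the design questions the paper examines.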

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.09141/full.md

## Figures

48 figures with captions in the complete paper: https://tomesphere.com/paper/1812.09141/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1812.09141/full.md

---
Source: https://tomesphere.com/paper/1812.09141