# Quotient Hash Tables - Efficiently Detecting Duplicates in Streaming Data

**Authors:** Rémi Géraud, Marius Lombard-Platet, David Naccache

arXiv: 1901.04358 · 2019-01-15

## TL;DR

This paper introduces the Quotient Hash Table (QHT), a data structure for duplicate detection in unbounded streams. It stems from a corrected analysis of streaming quotient filters (SQFs) and uses 33% less memory at equal performance.

## Contribution

The paper presents the QHT and its optimised variant, Queued QHT with Duplicates (QQHTD), together with a corrected analysis of streaming quotient filters, reduced memory usage, and a discussion of how adversarial inputs affect hash-based duplicate filters.

## Key findings

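- 33% reduction in memory usage compared to streaming quotient filters (SQFs), for equal performance
- New and thorough analysis of both the QHT and SQF algorithms, with results of interest to other existing constructions
- Discussion of the effect of adversarial inputs on hash-based duplicate filters similar to QHT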

## Abstract

This article presents the Quotient Hash Table (QHT), a new data structure for duplicate detection in unbounded streams. QHTs stem from a corrected analysis of streaming quotient filters (SQFs), resulting in a 33% reduction in memory usage for equal performance. We provide a new and thorough analysis of both algorithms, with results of interest to other existing constructions. We also introduce an optimised version of our new data structure dubbed Queued QHT with Duplicates (QQHTD). Finally, we discuss the effect of adversarial inputs for hash-based duplicate filters similar to QHT.
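To make the problem setting concrete, here is a minimal Python sketch of a generic fixed-memory, hash-based duplicate filter. It is not the paper's QHT or QQHTD algorithm: the class name `BucketedFingerprintFilter`, its parameters, the use of SHA-256, and the random-eviction policy are all assumptions chosen for illustration. The summary above does not specify how the actual structures work.

```python
import hashlib
import random


class BucketedFingerprintFilter:
    """Hypothetical fixed-memory duplicate filter (illustration only, not QHT)."""

    def __init__(self, n_buckets=1024, cells_per_bucket=4, fingerprint_bits=8, seed=0):
        self.n_buckets = n_buckets
        self.cells_per_bucket = cells_per_bucket
        self.fingerprint_bits = fingerprint_bits
        # 0 marks an empty cell, so stored fingerprints are always non-zero.
        self.table = [[0] * cells_per_bucket for _ in range(n_buckets)]
        self.rng = random.Random(seed)

    def _locate(self, item: bytes):
        # Derive a bucket index and a short non-zero fingerprint from one hash.
        digest = int.from_bytes(hashlib.sha256(item).digest()[:8], "big")
        bucket = digest % self.n_buckets
        fingerprint = (digest // self.n_buckets) % ((1 << self.fingerprint_bits) - 1) + 1
        return bucket, fingerprint

    def seen_then_add(self, item: bytes) -> bool:
        """Return True if item looks like a duplicate; otherwise record it."""
        bucket, fingerprint = self._locate(item)
        cells = self.table[bucket]
        if fingerprint in cells:
            return True
        for i, cell in enumerate(cells):
            if cell == 0:
                # Use the first empty cell if one exists.
                cells[i] = fingerprint
                return False
        # Bucket is full: overwrite a randomly chosen cell (assumed policy).
        cells[self.rng.randrange(self.cells_per_bucket)] = fingerprint
        return False


stream_filter = BucketedFingerprintFilter()
first = stream_filter.seen_then_add(b"alpha")   # first sighting
second = stream_filter.seen_then_add(b"alpha")  # repeated sighting
```

In a filter of this kind, memory stays constant however long the stream runs. The cost is two kinds of error: a false positive when two distinct items map to the same bucket and fingerprint, and a false negative when an item's fingerprint has been evicted before the item reappears.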

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.04358/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1901.04358/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1901.04358/full.md

---
Source: https://tomesphere.com/paper/1901.04358