# Streaming Binary Sketching based on Subspace Tracking and Diagonal Uniformization

**Authors:** Anne Morvan, Antoine Souloumiac, Cédric Gouy-Pailler, Jamal Atif

arXiv: 1705.07661 · 2018-02-12

## TL;DR

This paper introduces an online method for generating binary sketches from high-dimensional streaming data, enabling efficient similarity search without storing the entire dataset.

## Contribution

The proposed algorithm provides a fully online, memory-efficient way to produce binary embeddings with convergence guarantees for streaming high-dimensional data.

## Key findings

- Effective binary sketches for high-dimensional streams.
- Low time complexity and no need for dataset storage.
- Successful experiments on real data for nearest neighbor search.
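
To make the "no dataset storage" point concrete, here is a minimal one-pass sketch in Python. It is not the paper's algorithm: as an assumption, it tracks a $c$-dimensional subspace with Oja's rule followed by QR re-orthonormalization, and it omits the rotation step (the "diagonal uniformization" of the title) entirely. The class name `StreamingSketcher`, the learning rate `lr`, and the synthetic stream are all hypothetical. The only state kept is a running mean and a $c \times d$ matrix; each point is seen once and then discarded.

```python
import numpy as np


class StreamingSketcher:
    """One-pass binary sketcher (illustrative stand-in, not the paper's method).

    Tracks a c-dimensional subspace with Oja's rule and emits sign bits.
    Memory is O(c * dim): no data point is stored after it is processed.
    """

    def __init__(self, dim, c, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # Random orthonormal start: the rows of W span a c-dimensional subspace.
        q, _ = np.linalg.qr(rng.standard_normal((dim, c)))
        self.W = q.T                      # shape (c, dim)
        self.mean = np.zeros(dim)         # running mean of the stream
        self.n = 0                        # number of points seen so far
        self.lr = lr                      # assumed fixed step size

    def update(self, x):
        """Consume one point, update the subspace, return its c-bit code."""
        x = np.asarray(x, dtype=float)
        self.n += 1
        self.mean += (x - self.mean) / self.n     # incremental mean update
        xc = x - self.mean                        # centered point
        y = self.W @ xc                           # projection onto the subspace
        self.W += self.lr * np.outer(y, xc)       # Oja-style gradient step
        q, _ = np.linalg.qr(self.W.T)             # re-orthonormalize the rows
        self.W = q.T
        return (self.W @ xc > 0).astype(np.uint8)


# Usage on a synthetic stream (hypothetical data): 200 points in 32 dimensions.
rng = np.random.default_rng(42)
sketcher = StreamingSketcher(dim=32, c=8)
for _ in range(200):
    code = sketcher.update(rng.standard_normal(32))
```

Codes emitted early in the stream come from a subspace estimate that is still changing, so in this simplified version they are not directly comparable with later ones.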

## Abstract

In this paper, we address the problem of learning compact similarity-preserving embeddings for massive high-dimensional streams of data in order to perform efficient similarity search. We present a new online method for computing binary compressed representations (sketches) of high-dimensional real feature vectors. Given an expected code length $c$ and high-dimensional input data points, our algorithm provides a $c$-bit binary code that preserves the distances between the points in the original high-dimensional space. Our algorithm requires storing neither the whole dataset nor a chunk of it, so it is fully adaptable to the streaming setting. It also provides low time complexity and convergence guarantees. We demonstrate the quality of our binary sketches through experiments on real data for the nearest-neighbor search task in the online setting.
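
As a rough illustration of how $c$-bit codes support similarity search, the sketch below binarizes vectors by taking the sign of $c$ linear projections and ranks candidates by Hamming distance. It assumes a fixed random Gaussian projection as a stand-in; the paper learns its projection online, and that procedure is not reproduced here. The function names `make_projection`, `sketch`, and `hamming_nearest` and the synthetic data are hypothetical.

```python
import numpy as np


def make_projection(dim, c, seed=0):
    """Random Gaussian projection (assumed stand-in for a learned one)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((c, dim))


def sketch(X, P):
    """Map real vectors (rows of X) to c-bit codes: one sign bit per projection."""
    return (np.atleast_2d(X) @ P.T > 0).astype(np.uint8)


def hamming_nearest(query_code, codes):
    """Return the index of the closest code and all Hamming distances."""
    dists = np.count_nonzero(codes != query_code, axis=1)
    return int(np.argmin(dists)), dists


# Synthetic example: 100 points in 64 dimensions, sketched to 16 bits each.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 64))
P = make_projection(dim=64, c=16)
codes = sketch(X, P)

# Query with a slightly perturbed copy of point 7 and search in code space.
query = X[7] + 0.01 * rng.standard_normal(64)
idx, dists = hamming_nearest(sketch(query, P)[0], codes)
```

Comparing codes costs one pass over $c$ bits per candidate, which is the source of the speed-up over distance computations in the original space. With so few bits, ties and near-ties between candidates are possible, so a real system would typically re-rank a short list of candidates.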

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.07661/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1705.07661/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1705.07661/full.md

---
Source: https://tomesphere.com/paper/1705.07661