An $\tilde{O}(\frac{1}{\sqrt{T}})$-error online algorithm for retrieving heavily perturbated statistical databases in the low-dimensional querying model

Krzysztof Choromanski; Afshin Rostamizadeh; Umar Syed

arXiv:1504.01117 · cs.DB · April 7, 2015

TL;DR

This paper introduces an online algorithm that reconstructs a heavily perturbed statistical database from a stream of low-dimensional queries, achieving $\tilde{O}(\frac{1}{\sqrt{T}})$ error after $T$ queries while using only $O(\log T)$ memory.

Contribution

It presents the first $\tilde{O}(\frac{1}{\sqrt{T}})$-error online algorithm for reconstructing heavily perturbed statistical databases, using only $O(\log T)$ memory.

Findings

01

Achieves $\tilde{O}(\frac{1}{\sqrt{T}})$ average error in $T$ queries

02

Operates with only $O(\log T)$ memory

03

Handles high noise levels of O(D) in the data

Abstract

We give the first $\tilde{O}(\frac{1}{\sqrt{T}})$-error online algorithm for reconstructing noisy statistical databases, where $T$ is the number of (online) sample queries received. The algorithm, which requires only $O(\log T)$ memory, aims to learn a hidden database-vector $w^{*} \in \mathbb{R}^{D}$ in order to accurately answer a stream of queries regarding the hidden database, which arrive in an online fashion from some unknown distribution $\mathcal{D}$. We assume the distribution $\mathcal{D}$ is defined on the neighborhood of a low-dimensional manifold. The presented algorithm runs in $O(dD)$ time per query, where $d$ is the dimensionality of the query-space. Contrary to the classical setting, there is no separate training set that is used by the algorithm to learn the database: the stream on which the algorithm will be evaluated must also be used to learn the database-vector.…
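
The "no separate training set" protocol described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it swaps in plain online gradient descent as a stand-in learner, stores the full estimate (so it makes no attempt at the $O(\log T)$ memory bound), and is not claimed to reach the stated error rate. The function name `run_online_protocol`, the Gaussian noise model, the random linear subspace used as the low-dimensional query space, and all parameter values are assumptions made for illustration only.

```python
import random


def run_online_protocol(T=2000, D=20, d=3, noise=0.5, lr=0.05, seed=0):
    """Predict-then-update loop on one stream (illustrative stand-in only)."""
    rng = random.Random(seed)
    # Hypothetical hidden database-vector w* in R^D.
    w_star = [rng.uniform(-1, 1) for _ in range(D)]
    # Assumed low-dimensional query space: span of d random directions in R^D.
    basis = [[rng.gauss(0, 1) / D ** 0.5 for _ in range(D)] for _ in range(d)]

    w = [0.0] * D        # learner's current estimate of w*
    total_err = 0.0
    avg_errors = []      # running average error after each query

    for t in range(1, T + 1):
        # Draw a query from the d-dimensional subspace.
        coeffs = [rng.gauss(0, 1) for _ in range(d)]
        q = [sum(c * b[j] for c, b in zip(coeffs, basis)) for j in range(D)]
        truth = sum(qi * wi for qi, wi in zip(q, w_star))

        # 1) Answer the query first; this answer is what gets evaluated.
        pred = sum(qi * wi for qi, wi in zip(q, w))
        total_err += abs(pred - truth)
        avg_errors.append(total_err / t)

        # 2) Only then learn from a noisy response to the same query
        #    (gradient step on the squared loss; a stand-in update rule).
        noisy = truth + rng.gauss(0, noise)
        step = lr * (pred - noisy)
        w = [wi - step * qi for wi, qi in zip(w, q)]

    return avg_errors
```

Calling `run_online_protocol()` returns the running average error after each of the `T` queries. Because every query is answered before it is used for learning, the same stream serves as both the evaluation set and the training data.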

Taxonomy

Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification