Top-k data selection via distributed sample quantile inference

Xu Zhang; Marcos Vasconcelos

arXiv:2212.00230 · cs.DC · December 2, 2022

TL;DR

This paper introduces a distributed two-time-scale stochastic approximation algorithm for top-$k$ data selection over a network of agents with noisy communication links. It casts the task as sample quantile inference and proves almost sure convergence to an optimal solution.

Contribution

It presents a two-time-scale stochastic approximation method for distributed sample quantile inference, proves that it converges almost surely to an optimal solution, and reports empirical convergence within a small number of iterations.

Findings

01

The algorithm converges almost surely to an optimal solution.

02

It tolerates noise on the communication links between agents.

03

Empirically, it reaches the correct top-$k$ selection within a small number of iterations.

Abstract

We consider the problem of determining the top-$k$ largest measurements from a dataset distributed among a network of $n$ agents with noisy communication links. We show that this scenario can be cast as a distributed convex optimization problem called sample quantile inference, which we solve using a two-time-scale stochastic approximation algorithm. Herein, we prove the algorithm's convergence in the almost sure sense to an optimal solution. Moreover, our algorithm handles noise and empirically converges to the correct answer within a small number of iterations.
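The idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm or the code in their repository; it is a generic consensus-plus-subgradient iteration under stated assumptions. It relies on a standard fact: with $N$ data points and $p = (N-k)/N$, any $\theta$ between the $k$-th and $(k+1)$-th largest values minimizes the convex pinball objective $\sum_i \rho_p(z_i - \theta)$, where $\rho_p(u) = u\,(p - \mathbf{1}\{u < 0\})$, so the top-$k$ points are those above $\theta$. The ring topology, the Gaussian link noise, the two step-size schedules, the toy data, and the names `local_subgradient` and `estimate_threshold` are all hypothetical choices made for illustration.

```python
import random


def local_subgradient(theta, data, p):
    """Subgradient in theta of sum_z rho_p(z - theta) over one agent's data."""
    g = 0.0
    for z in data:
        if z > theta:
            g -= p          # raising theta toward z lowers the loss
        elif z < theta:
            g += 1.0 - p    # raising theta further above z raises the loss
    return g


def estimate_threshold(local_data, k, iters=20000, noise_std=0.5, seed=0):
    """Each agent estimates a top-k threshold using only noisy neighbour messages.

    Assumptions (not taken from the paper): agents sit on a ring, link noise
    is additive Gaussian, and both step sizes follow simple power-law decays.
    """
    rng = random.Random(seed)
    n = len(local_data)
    total = sum(len(d) for d in local_data)
    p = (total - k) / total                # quantile level separating the top k
    x = [0.0] * n                          # one threshold estimate per agent
    for t in range(iters):
        beta = 0.3 / (t + 1) ** 0.6        # consensus gain (assumed schedule)
        alpha = 2.0 / (t + 1)              # subgradient step (assumed schedule)
        new_x = []
        for i in range(n):
            mix = 0.0
            for j in ((i - 1) % n, (i + 1) % n):             # ring neighbours
                received = x[j] + rng.gauss(0.0, noise_std)  # noisy link
                mix += received - x[i]
            grad = local_subgradient(x[i], local_data[i], p)
            new_x.append(x[i] + beta * mix - alpha * grad)
        x = new_x
    return x


if __name__ == "__main__":
    # Toy dataset: 4 agents, 12 values; the top 3 are 50, 60 and 70.
    data = [[1, 2, 50], [3, 4, 60], [5, 6, 70], [7, 8, 10]]
    print(estimate_threshold(data, k=3))
```

In this toy setup any threshold in the interval $[10, 50]$ is optimal. Because the estimates start at zero and approach that interval from below, they would be expected to settle near its lower edge, so a final selection step that compares local values against the estimate would likely need a margin or a longer run. The two gains decay at different rates so that the subgradient step becomes negligible relative to the consensus gain over time; the exact conditions the paper imposes on its step sizes are not stated in this listing.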

Code & Models

Repositories

mullervasconcelos/l4dc23 (Official)

Taxonomy

Topics: Distributed Sensor Networks and Detection Algorithms · Statistical Methods and Inference · Privacy-Preserving Technologies in Data