A Faster Generalized Two-Stage Approximate Top-K

Yashas Samaga; Varun Yerram; Spandana Raj Babbula; Prateek Jain; Praneeth Netrapalli

arXiv:2506.04165·cs.LG·May 14, 2026

TL;DR

This paper generalizes the two-stage approximate Top-K algorithm of Chern et al. (2022) so that the first stage keeps the top K' elements of each partition instead of only the top 1. It gives theoretical bounds on expected recall and reports a roughly 10x speedup on Cloud TPUv5e without loss of recall.

Contribution

The paper introduces a generalized first stage that selects the top K' elements from each partition (1 ≤ K' ≤ K), together with a theoretical analysis of expected recall and a practical implementation that runs faster while maintaining recall.

Findings

01

The expected-recall bounds are tighter than those in previous work.

02

Choosing K' > 1 with fewer partitions reduces the input size of the second stage more effectively.

03

Achieves ~10x speedup on Cloud TPUv5e without losing recall.
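
The trade-off in finding 02 can be explored with a small simulation. The sketch below is a Monte Carlo estimate of expected recall under random partitioning, not the paper's closed-form expression. The function name `simulated_recall` and the sizes used (4096 elements, K = 64) are illustrative assumptions, not values from the paper. It relies on one observation: a true top-K element survives the first stage exactly when it is among the K' largest elements of its partition, so a partition holding c true top-K elements keeps min(c, K') of them.

```python
import random
from collections import Counter


def simulated_recall(n, k, k_prime, num_partitions, trials=200, seed=0):
    """Monte Carlo estimate of expected recall under random partitioning.

    Hypothetical helper for illustration. Only the positions of the true
    top-k elements matter, so each trial draws k random positions out of n
    and maps each position to its equal-sized partition.
    """
    if n % num_partitions != 0:
        raise ValueError("this sketch assumes equal-sized partitions")
    rng = random.Random(seed)
    chunk = n // num_partitions
    kept_total = 0
    for _ in range(trials):
        positions = rng.sample(range(n), k)
        per_partition = Counter(p // chunk for p in positions)
        # A partition with c true top-k elements keeps min(c, k_prime) of them.
        kept_total += sum(min(c, k_prime) for c in per_partition.values())
    return kept_total / (trials * k)


# Compare a top-1 first stage against a top-4 first stage with fewer partitions.
for k_prime, num_partitions in [(1, 1024), (4, 128)]:
    second_stage_size = k_prime * num_partitions
    estimate = simulated_recall(4096, 64, k_prime, num_partitions)
    print(k_prime, num_partitions, second_stage_size, round(estimate, 4))
```

The third printed column is the number of elements the second stage has to sort (K' times the number of partitions), which is the quantity finding 02 refers to.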

Abstract

We consider the Top-$K$ selection problem, which aims to identify the largest $K$ elements in an array. Top-$K$ selection arises in many machine learning algorithms and often becomes a bottleneck on accelerators, which are optimized for dense matrix multiplications. To address this problem, Chern et al. (2022) proposed a fast two-stage approximate Top-$K$ algorithm that: (i) partitions the input array into equal-sized chunks and selects the top-$1$ element from each partition; and (ii) sorts the resulting smaller subset and returns the top $K$ elements. In this paper, we generalize the first stage so that each partition selects the top $K'$ elements (for $1 \leq K' \leq K$). Our contributions include: (i) an expression for the expected recall of this generalized algorithm under random partitioning, and a demonstration that choosing $K' > 1$ with fewer partitions in the first stage more…
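
The two stages described in the abstract can be sketched as a sequential reference implementation. This is a minimal illustration, not the authors' accelerator implementation: the names `approx_top_k` and `recall` are hypothetical, the partitions are contiguous equal-sized chunks, and distinct values are assumed so that recall can be computed with set intersection.

```python
import heapq


def approx_top_k(values, k, k_prime, num_partitions):
    """Generalized two-stage approximate Top-K (illustrative sketch).

    Stage 1: split `values` into `num_partitions` equal-sized chunks and
    keep the top `k_prime` elements of each chunk.
    Stage 2: sort the survivors and return the top `k`.
    """
    n = len(values)
    if n % num_partitions != 0:
        raise ValueError("this sketch assumes equal-sized partitions")
    if not 1 <= k_prime <= k:
        raise ValueError("expected 1 <= k_prime <= k")
    chunk = n // num_partitions
    survivors = []
    for start in range(0, n, chunk):
        survivors.extend(heapq.nlargest(k_prime, values[start:start + chunk]))
    return sorted(survivors, reverse=True)[:k]


def recall(approx, exact):
    """Fraction of the exact top-k that the approximate result recovered."""
    return len(set(approx) & set(exact)) / len(exact)


values = list(range(16))
exact = sorted(values, reverse=True)[:4]
print(recall(approx_top_k(values, 4, 1, 4), exact))  # 0.25
print(recall(approx_top_k(values, 4, 2, 2), exact))  # 0.5
```

The sorted toy input places all of the true top 4 in one chunk, which is a bad case for contiguous partitions. The recall analysis in the abstract assumes random partitioning, so shuffling the input before calling `approx_top_k` is closer to that setting.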
