Efficient stream-based Max-Min diversification with minimal failure rate
Argyris Kalogeratos, Yutai Nazir Zhao, Mathilde Fekom

TL;DR
This paper introduces Failure Rate Minimization (FRM), a streaming algorithm that selects diverse items while keeping the rate of selection failures low, improving max-min diversification on data streams where decisions are immediate and irrevocable.
Contribution
The paper presents FRM, a novel rank-based algorithm that reduces failure rates in streaming max-min diversification, addressing the fact that standard sequential-selection approaches disregard selection failures.
Findings
FRM significantly reduces failure probability compared to existing strategies.
Simulation results demonstrate FRM's superior performance in diverse streaming scenarios.
FRM effectively balances diversity and decision immediacy in streaming selection.
Abstract
The streaming max-min diversification problem concerns the selection of a limited and diverse sample of items out of a data stream of known finite length. The objective to be maximized is the minimum distance between any pair of selected items. We consider irrevocable-choice sampling, where decisions must be immediate and irrevocable while processing the items of the stream, a setting that has been little studied in the literature. Standard algorithmic approaches for sequential selection disregard selection failures, which occur when the last items of the stream are picked by default to avoid delivering an incomplete selection set. This defect can be catastrophic for the max-min diversification objective. The proposed Failure Rate Minimization (FRM) is a rank-based algorithm that selects a set of diverse items and, in addition, significantly reduces the probability of failures.…
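To make the objective and the notion of a selection failure concrete, the following is a minimal Python sketch. It is not the FRM algorithm, whose details are not given here. The function names `min_pairwise_distance` and `threshold_select`, the fixed distance threshold, and the use of Euclidean distance are all assumptions made for illustration. The sketch computes the max-min objective and runs a naive irrevocable baseline that fills its remaining slots by default when the stream is about to end, counting each such forced pick as a failure.

```python
import itertools
import math


def min_pairwise_distance(points):
    """Max-min objective: the smallest distance over all pairs of selected items."""
    if len(points) < 2:
        return math.inf
    return min(math.dist(p, q) for p, q in itertools.combinations(points, 2))


def threshold_select(stream, k, threshold):
    """Hypothetical irrevocable baseline (NOT FRM), for a stream of known length.

    An item is accepted if it lies at least `threshold` away from every item
    accepted so far. When the number of remaining items no longer exceeds the
    number of open slots, items are taken by default; each such forced pick
    is counted as a selection failure.
    """
    selected = []
    failures = 0
    n = len(stream)
    for i, item in enumerate(stream):
        if len(selected) == k:
            break
        remaining = n - i           # items left, including the current one
        open_slots = k - len(selected)
        if remaining <= open_slots:
            selected.append(item)   # forced pick: a selection failure
            failures += 1
        elif all(math.dist(item, s) >= threshold for s in selected):
            selected.append(item)   # voluntary, irrevocable pick
    return selected, failures


# A stream of clustered points with an overly strict threshold:
# the last two items are picked by default and collapse the objective.
clustered = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.1, 0.1)]
picked, failures = threshold_select(clustered, k=3, threshold=5.0)
print(picked, failures, min_pairwise_distance(picked))
```

In this toy run the baseline accepts the first item, rejects the second, and is then forced to take the last two, so the minimum pairwise distance of the result is dictated by whatever happened to arrive at the end of the stream. This is the kind of defect the abstract describes as potentially catastrophic for the max-min objective.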
Taxonomy
Topics: Data Management and Algorithms · Data Stream Mining Techniques · Machine Learning and Data Classification
