Maximal Intersection Queries in Randomized Input Models

Benjamin Hoffmann; Mikhail Lifshits; Yury Lifshits; Dirk Nowotka

arXiv:1004.0092 · cs.IR · April 2, 2010

TL;DR

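This paper studies maximal intersection queries over set families drawn from two randomized input models. For each model it gives an algorithm that, with very high probability, finds an almost optimal solution in time logarithmic in the size of the family. It also identifies a threshold phenomenon on the probabilities of intersecting sets in each model.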

Contribution

The paper investigates two well-motivated distributions over families of sets and proposes an algorithm for each. With very high probability, each algorithm finds an almost optimal solution in time logarithmic in the size of the family.

Findings

1. Query time is logarithmic in the size of the set family.

2. With very high probability, the algorithms return an almost optimal solution.

3. Each of the two input models exhibits a threshold phenomenon on the probabilities of intersecting sets.

Abstract

Consider a family of sets and a single set, called the query set. How can one quickly find a member of the family which has a maximal intersection with the query set? Time constraints on the query and on a possible preprocessing of the set family make this problem challenging. Such maximal intersection queries arise in a wide range of applications, including web search, recommendation systems, and distributing on-line advertisements. In general, maximal intersection queries are computationally expensive. We investigate two well-motivated distributions over all families of sets and propose an algorithm for each of them. We show that with very high probability an almost optimal solution is found in time which is logarithmic in the size of the family. Moreover, we point out a threshold phenomenon on the probabilities of intersecting sets in each of our two input models which leads to the…
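
To make the problem statement concrete, here is a minimal sketch of the naive baseline: a linear scan that intersects the query with every member of the family. This is not one of the paper's algorithms. It is the exhaustive approach whose cost grows with the size of the family, which the abstract contrasts with logarithmic query time. The function name `max_intersection_scan` and the sample data are assumptions made for illustration.

```python
from typing import Hashable, Sequence, Set, Tuple


def max_intersection_scan(
    family: Sequence[Set[Hashable]], query: Set[Hashable]
) -> Tuple[int, int]:
    """Return (index, overlap) of a family member with the largest
    intersection with `query`. Ties go to the earliest member.

    Hypothetical helper for illustration: an exhaustive scan, not the
    randomized-model algorithms studied in the paper.
    """
    if not family:
        raise ValueError("family must be non-empty")

    best_index, best_overlap = 0, len(family[0] & query)
    for i in range(1, len(family)):
        overlap = len(family[i] & query)
        if overlap > best_overlap:
            best_index, best_overlap = i, overlap
    return best_index, best_overlap


# Illustrative data (assumed, not from the paper).
family = [{"a", "b"}, {"b", "c", "d"}, {"e"}]
query = {"b", "c", "x"}
print(max_intersection_scan(family, query))  # prints (1, 2)
```

The scan touches every set in the family on each query, so it needs no preprocessing but scales linearly with the number of sets. It serves only as a reference point for what a sublinear method has to beat.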
