Bounds on data limits for all-to-all comparison from combinatorial designs
Joanne Hall, Daniel Horsley, Douglas R. Stinson

TL;DR
This paper explores how to efficiently distribute data across machines to ensure every item is compared with every other item, using combinatorial designs and proving new bounds.
Contribution
The paper proves a new lower bound on ATAC data limits and relates data limits to previously studied combinatorial parameters.
Findings
Transversal designs and projective Hjelmslev planes are analyzed for their data limit performance.
A new lower bound on ATAC data limits is proven, improving on previous results.
Special cases where the new bound is tight are identified.
Abstract
In situations where every item in a data set must be compared with every other item in the set, it may be desirable to store the data across a number of machines in such a way that any two data items are stored together on at least one machine. One way to evaluate the efficiency of such a distribution is by the largest fraction of the data it requires to be allocated to any one machine. The all-to-all comparison (ATAC) data limit for m machines is a measure of the minimum of this value across all possible such distributions. In this paper we further the study of ATAC data limits. We begin by investigating the data limits achievable using various classes of combinatorial designs. In particular, we examine the cases of transversal designs and projective Hjelmslev planes. We then observe relationships between data limits and the previously studied combinatorial parameters of fractional…
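The covering condition and the efficiency measure described in the abstract can be illustrated with a minimal sketch. The helper names `covers_all_pairs` and `max_fraction` are hypothetical and do not come from the paper. The Fano plane (7 points, 7 lines of 3 points each, every pair of points on exactly one line) is used here only as a standard small example of a valid distribution over 7 machines. The fraction it gives is the value for this one distribution, so it is only an upper bound on the data limit for 7 machines.

```python
from fractions import Fraction
from itertools import combinations


def covers_all_pairs(items, machines):
    """Return True if every pair of items is stored together on at least one machine."""
    return all(
        any(a in stored and b in stored for stored in machines)
        for a, b in combinations(items, 2)
    )


def max_fraction(items, machines):
    """Largest fraction of the data set allocated to any single machine."""
    return max(Fraction(len(stored), len(items)) for stored in machines)


# Illustrative assumption: 7 data items placed on 7 machines,
# one machine per line of the Fano plane.
items = list(range(7))
fano_lines = [
    {0, 1, 2}, {0, 3, 4}, {0, 5, 6},
    {1, 3, 5}, {1, 4, 6},
    {2, 3, 6}, {2, 4, 5},
]

print(covers_all_pairs(items, fano_lines))  # every pair shares a machine
print(max_fraction(items, fano_lines))      # fraction held by the fullest machine
```

Removing any one machine from `fano_lines` leaves three pairs uncovered, so `covers_all_pairs` then returns `False`; this shows the check is sensitive to the covering requirement rather than just to machine sizes.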
Taxonomy
Topics: graph theory and CDMA systems · Optimal Experimental Design Methods · Statistical Methods in Clinical Trials
