Optimal Data Distribution for Big-Data All-to-All Comparison using Finite Projective and Affine Planes
Joanne L. Hall, Wayne Kelly, Yu-Chu Tian

TL;DR
This paper introduces a novel data distribution method for large-scale all-to-all comparison tasks, leveraging geometric and combinatorial structures to optimize data placement and computational load balancing.
Contribution
It proposes a data distribution framework for distributed computing, based on projective planes, affine planes and balanced incomplete block designs, that improves efficiency for All-to-All Comparison problems.
Findings
Achieves minimal data replication
Balances computational load effectively
Outperforms traditional methods in large data sets
Abstract
An All-to-All Comparison problem is where every element of a data set is compared with every other element. This is analogous to projective planes and affine planes where every pair of points share a common line. For large data sets, the comparison computations can be distributed across a cluster of computers. All-to-All Comparison does not fit the highly successful Map-Reduce pattern, so a new distributed computing framework is required. The principal challenge is to distribute the data in such a way that computations can be scheduled where the data already lies. This paper uses projective planes, affine planes and balanced incomplete block designs to design data distributions and schedule computations. The data distributions based on these geometric and combinatorial structures achieve minimal data replication whilst balancing the computational load across the cluster.
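The geometric idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that data items are identified with the points of a projective plane of prime order p and that cluster nodes are identified with its lines, so each node stores the items lying on its line. The function names (`normalized_triples`, `projective_plane_blocks`, `comparison_owners`) are hypothetical and chosen only for this example.

```python
from itertools import combinations, product


def normalized_triples(p):
    """Canonical representatives of the points of the projective plane
    over GF(p), p prime: nonzero triples whose first nonzero entry is 1."""
    reps = []
    for v in product(range(p), repeat=3):
        first = next((x for x in v if x != 0), None)
        if first == 1:
            reps.append(v)
    return reps


def projective_plane_blocks(p):
    """Return one block per line; each block lists the indices of the
    points (data items) on that line. Assumption: one line = one node."""
    points = normalized_triples(p)
    index = {pt: i for i, pt in enumerate(points)}
    blocks = []
    # By duality, lines have the same canonical representatives as points.
    for line in points:
        block = [
            index[pt]
            for pt in points
            # A point lies on a line when their dot product is 0 mod p.
            if sum(a * b for a, b in zip(line, pt)) % p == 0
        ]
        blocks.append(block)
    return blocks


def comparison_owners(blocks):
    """Map each unordered pair of items to the nodes holding both items."""
    owners = {}
    for node, block in enumerate(blocks):
        for pair in combinations(sorted(block), 2):
            owners.setdefault(pair, []).append(node)
    return owners


blocks = projective_plane_blocks(3)
owners = comparison_owners(blocks)
```

For p = 3 this gives 13 items spread over 13 nodes with 4 items per node. Every one of the 78 pairs of items is co-located on exactly one node, so each comparison can run where its data already lies, and every node is assigned 6 comparisons.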
Taxonomy
Topics: Interconnection Networks and Systems · Embedded Systems Design Techniques · Parallel Computing and Optimization Techniques
