Reconciling Similar Sets of Data
Ryan Gabrys, Farzad Farnoud (Hassanzadeh)

TL;DR
This paper studies how to synchronize two data sets whose symmetric difference is small and whose differing elements are close in Hamming distance. It derives upper and lower bounds on the information that must be exchanged and gives explicit encoding and decoding algorithms for many cases.
Contribution
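It derives upper and lower bounds on the minimum information exchange, and gives explicit encoding and decoding algorithms, for synchronizing sets whose symmetric difference is small and whose differing elements are related through the Hamming distance.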
Findings
Derived bounds on minimal information exchange for synchronization
Provided explicit encoding and decoding algorithms for various cases
Analyzed the impact of Hamming distance on synchronization efficiency
Abstract
In this work, we consider the problem of synchronizing two sets of data where the size of the symmetric difference between the sets is small and, in addition, the elements in the symmetric difference are related through the Hamming distance metric. Upper and lower bounds are derived on the minimum amount of information exchange. Furthermore, explicit encoding and decoding algorithms are provided for many cases.
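To make the setting concrete, the following minimal Python sketch computes the two quantities the abstract refers to: the symmetric difference between two sets and the Hamming distance between elements of that difference. It only illustrates the problem setup and is not the paper's encoding or decoding algorithm. The function names, the representation of elements as equal-length bit strings, and the sample sets are assumptions made for illustration; the paper's precise model of how differing elements are related may differ.

```python
from itertools import combinations


def hamming(x: str, y: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(x, y))


def symmetric_difference(a: set, b: set) -> set:
    """Elements that belong to exactly one of the two sets."""
    return a ^ b


def max_pairwise_distance(elements: set) -> int:
    """Largest Hamming distance between any two elements (0 if fewer than two)."""
    return max(
        (hamming(x, y) for x, y in combinations(sorted(elements), 2)),
        default=0,
    )


# Hypothetical example data: each host holds a set of 4-bit strings.
set_a = {"0000", "0110", "1111"}
set_b = {"0000", "0111", "1111"}

diff = symmetric_difference(set_a, set_b)
print(sorted(diff))                 # the elements the hosts disagree on
print(max_pairwise_distance(diff))  # how close those elements are to each other
```

In this toy instance the two hosts disagree on only two strings, and those strings differ in a single bit. This is the regime the abstract describes: a small symmetric difference whose elements are close in Hamming distance.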
