Efficient File Synchronization: a Distributed Source Coding Approach
Nan Ma, Kannan Ramchandran, David Tse

TL;DR
This paper treats file synchronization as a distributed source coding problem in which the decoder's side-information is a copy of the source corrupted by bursty deletions, modeled by a Markov chain. It characterizes the minimum rate needed to reconstruct the source with high probability.
Contribution
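It formulates reconstruction under bursty deletions in an information-theoretic, distributed source coding framework, characterizes the minimum rate by an information-theoretic expression, and computes the asymptotic expansion of that rate for small bursty deletion probability.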
Findings
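Characterized the minimum rate needed to reconstruct the source with high probability when the side-information suffers bursty deletions.
Computed the asymptotic expansion of the minimum rate for small bursty deletion probability.
Interpreted the rate as the information in the deleted content and the deletion locations, minus "nature's secret": the uncertainty of the locations given the source and side-information.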
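The interpretation in the last finding can be read off an entropy chain rule. The sketch below is not taken from the paper: the symbols are assumptions introduced here, with X the source sequence, Y the side-information, and D the binary deletion pattern, and it assumes the minimum rate is governed by the conditional entropy of X given Y, as in Slepian-Wolf-type results.

```latex
% Chain rule applied to H(X, D | Y) in two orders (symbols X, Y, D are assumed notation):
%   H(X, D | Y) = H(X | Y) + H(D | X, Y)
% Rearranging gives
\begin{align}
H(\mathbf{X} \mid \mathbf{Y})
  &= \underbrace{H(\mathbf{X}, \mathbf{D} \mid \mathbf{Y})}_{\text{deleted content and deletion locations}}
   \;-\; \underbrace{H(\mathbf{D} \mid \mathbf{X}, \mathbf{Y})}_{\text{``nature's secret''}}
\end{align}
```

The second term is nonzero because several deletion patterns can turn the same source into the same side-information (for example, deleting either symbol of a repeated pair), and the decoder never needs to learn which one occurred.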
Abstract
The problem of reconstructing a source sequence in the presence of decoder side-information that is mis-synchronized to the source due to deletions is studied in a distributed source coding framework. Motivated by practical applications, the deletion process is assumed to be bursty and is modeled by a Markov chain. The minimum rate needed to reconstruct the source sequence with high probability is characterized in terms of an information-theoretic expression, which is interpreted as the amount of information of the deleted content and the locations of deletions, subtracting "nature's secret", that is, the uncertainty of the locations given the source and side-information. For small bursty deletion probability, the asymptotic expansion of the minimum rate is computed.
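To make the bursty deletion model concrete, here is a minimal Python sketch of a two-state Markov deletion process applied to a source sequence. It is an illustration under stated assumptions, not the paper's construction: the function names, the parameters `p_start` and `p_continue`, and the choice to start the chain in the "keep" state are all hypothetical.

```python
import random


def bursty_deletion_pattern(n, p_start, p_continue, seed=None):
    """Return n booleans; True means the symbol at that index is deleted.

    Assumed two-state Markov chain (hypothetical parameter names):
      keep   -> delete with probability p_start
      delete -> delete with probability p_continue
    The chain is assumed to start in the 'keep' state.
    """
    rng = random.Random(seed)
    pattern = []
    deleting = False
    for _ in range(n):
        p = p_continue if deleting else p_start
        deleting = rng.random() < p
        pattern.append(deleting)
    return pattern


def apply_deletions(source, pattern):
    """Produce the side-information: the source with deleted symbols removed."""
    return [x for x, deleted in zip(source, pattern) if not deleted]


def burst_lengths(pattern):
    """Lengths of the maximal runs of consecutive deletions."""
    lengths, run = [], 0
    for deleted in pattern:
        if deleted:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


if __name__ == "__main__":
    src_rng = random.Random(0)
    source = [src_rng.randint(0, 1) for _ in range(40)]
    pattern = bursty_deletion_pattern(len(source), p_start=0.05, p_continue=0.7, seed=1)
    side_info = apply_deletions(source, pattern)
    print("source length:   ", len(source))
    print("side-info length:", len(side_info))
    print("burst lengths:   ", burst_lengths(pattern))
```

A small `p_start` with a large `p_continue` gives rare but long bursts, which is the regime the small-deletion-probability expansion is concerned with. The decoder sees only `side_info`, not `pattern`.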
