Recovering a Message from an Incomplete Set of Noisy Fragments

Aditya Narayan Ravi; Alireza Vahid; Ilan Shomorony

arXiv:2407.05544·cs.IT·July 9, 2024

TL;DR

This paper analyzes the capacity of a novel torn-paper channel model, characterizing how well information can be recovered from incomplete, shuffled, and noisy message fragments, with applications in molecular data storage and forensics.

Contribution

The paper gives a closed-form expression for the channel capacity under arbitrary fragment length distributions and deletion probabilities, and derives upper and lower capacity bounds for the case where the fragments are also noisy.

Findings

01

Capacity is given by a closed-form expression of the form F - A, where F is the coverage fraction and A is the alignment cost caused by the loss of fragment ordering.

02

Bounds for noisy fragments are derived and match under certain conditions.

03

The model applies to molecular storage and forensic data reconstruction.

Abstract

We consider the problem of communicating over a channel that breaks the message block into fragments of random lengths, shuffles them out of order, and deletes a random fraction of the fragments. Such a channel is motivated by applications in molecular data storage and forensics, and we refer to it as the torn-paper channel. We characterize the capacity of this channel under arbitrary fragment length distributions and deletion probabilities. Precisely, we show that the capacity is given by a closed-form expression that can be interpreted as F - A, where F is the coverage fraction, i.e., the fraction of the input codeword that is covered by output fragments, and A is an alignment cost incurred due to the lack of ordering in the output fragments. We then consider a noisy version of the problem, where the fragments are corrupted by binary symmetric noise. We derive upper and lower bounds…
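
To make the channel description concrete, here is a minimal simulation sketch of the tearing, deletion, and shuffling steps, together with an empirical estimate of the coverage fraction F. It is an illustration only, not the paper's construction: the function name `torn_paper_channel`, the geometric fragment-length distribution, and the independent per-fragment deletion with probability `delete_prob` are all assumptions chosen for simplicity (the paper allows arbitrary fragment length distributions). The sketch does not compute the alignment cost A or the capacity itself.

```python
import random


def torn_paper_channel(codeword, mean_len, delete_prob, rng):
    """Tear a codeword into fragments, delete some, and shuffle the rest.

    Assumptions (not taken from the paper):
      * fragment lengths are geometric with mean `mean_len` (mean_len >= 1);
      * each fragment is deleted independently with probability `delete_prob`.

    Returns (kept_fragments, coverage), where coverage is the fraction of
    codeword symbols that survive in the output fragments.
    """
    n = len(codeword)
    fragments = []
    start = 0
    while start < n:
        # Draw a geometric length: keep extending with probability 1 - 1/mean_len.
        length = 1
        while rng.random() > 1.0 / mean_len:
            length += 1
        # Slicing truncates the last fragment at the end of the codeword.
        fragments.append(codeword[start:start + length])
        start += length

    # Delete each fragment independently, then discard the original order.
    kept = [frag for frag in fragments if rng.random() >= delete_prob]
    rng.shuffle(kept)

    coverage = sum(len(frag) for frag in kept) / n if n else 0.0
    return kept, coverage
```

Calling `torn_paper_channel(bits, 20, 0.3, random.Random(1))` on a list of bits returns the surviving fragments in random order along with the observed coverage. Under these assumptions the coverage should fall somewhat below 1 on average when `delete_prob` is positive, which is the quantity the F term in F - A accounts for.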

Taxonomy

Topics: Algorithms and Data Compression