Reconstructing words using queries on subwords or factors

Gwenaël Richomme (UPVM); Matthieu Rosenfeld (UM)

arXiv:2301.01571·cs.DM·January 5, 2023


TL;DR

This paper tightens upper bounds on the number of queries needed to reconstruct an unknown word of length n over a k-letter alphabet, where each query asks how often, or whether, a given word occurs in it as a subword or as a factor.

Contribution

It gives tighter upper bounds for word reconstruction from subword and factor queries, improving earlier results of Fleischmann, Lejeune, Manea, Nowotka and Rigo, and of Skiena and Sundaram.

Findings

01

An unknown word of length n over a k-letter alphabet can be reconstructed from the numbers of occurrences, as subwords, of O(k^2√(n log n)) words.

02

The Skiena and Sundaram upper bound for reconstruction from queries on the existence of subwords is slightly improved.

03

The Skiena and Sundaram upper bound for reconstruction from queries on the existence of factors is also slightly improved.

Abstract

We study word reconstruction problems. Improving a previous result by P. Fleischmann, M. Lejeune, F. Manea, D. Nowotka and M. Rigo, we prove that, for any unknown word $w$ of length $n$ over an alphabet of cardinality $k$, $w$ can be reconstructed from the number of occurrences as subwords (or scattered factors) of $O(k^{2}\sqrt{n\log_{2}(n)})$ words. Two previous upper bounds obtained by S. S. Skiena and G. Sundaram are also slightly improved: one when considering information on the existence of subwords instead of on the numbers of their occurrences, and, the other when considering information on the existence of factors.
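To make the three kinds of query in the abstract concrete, here is a minimal Python sketch of what each one returns for a known word. It only illustrates the queries themselves and is not the reconstruction procedure from the paper. The function names (count_subword_occurrences, is_subword, is_factor) are hypothetical and chosen for this example.

```python
def count_subword_occurrences(w: str, u: str) -> int:
    """Count the occurrences of u as a subword (scattered factor) of w."""
    # dp[j] = number of ways to embed u[:j] into the part of w read so far.
    dp = [0] * (len(u) + 1)
    dp[0] = 1  # the empty word embeds in exactly one way
    for letter in w:
        # Scan right to left so the current letter of w extends each
        # partial embedding at most once.
        for j in range(len(u), 0, -1):
            if u[j - 1] == letter:
                dp[j] += dp[j - 1]
    return dp[len(u)]


def is_subword(w: str, u: str) -> bool:
    """Existence query: does u occur in w as a subword?"""
    return count_subword_occurrences(w, u) > 0


def is_factor(w: str, u: str) -> bool:
    """Existence query: does u occur in w as a factor (contiguous block)?"""
    return u in w


if __name__ == "__main__":
    w = "abab"
    print(count_subword_occurrences(w, "ab"))  # counting query
    print(is_subword(w, "aa"))                 # subword existence query
    print(is_factor(w, "aa"))                  # factor existence query
```

For w = abab, the word ab occurs three times as a subword (positions 1-2, 1-4 and 3-4), while aa is a subword of w but not a factor. The reconstruction problems ask how many such queries suffice to determine w uniquely.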


Taxonomy

Topics: Algorithms and Data Compression · Semigroups and Automata Theory · Genome Rearrangement Algorithms