Cross-Document Pattern Matching

Gregory Kucherov, Yakov Nekrich, and Tatiana Starikovskaya

arXiv:1202.4076 · cs.DS · June 21, 2012

TL;DR

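This paper introduces cross-document string matching, a new variant of the string matching problem, and gives linear-space solutions whose query times either do not depend on the pattern size or depend on it only doubly logarithmically. As a side result, it gives an improved solution to the weighted level ancestor problem.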

Contribution

It defines the cross-document string matching problem and proposes linear-space solutions for several of its variants, with query times that either do not depend on the pattern size or depend on it only doubly logarithmically.

Findings

01

Efficient linear-space algorithms for cross-document string matching.

02

Query times either do not depend on the pattern size or depend on it only doubly logarithmically.

03

Improved solution to the weighted level ancestor problem.
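
For readers unfamiliar with the weighted level ancestor problem named in the third finding, the following is a minimal sketch of one common formulation, not the paper's data structure. It assumes a rooted tree in which node weights strictly increase from the root downward, and a query (v, w) that asks for the ancestor of v closest to the root whose weight is at least w. The names `parent`, `weight`, and `weighted_level_ancestor` are hypothetical, and the paper's exact definition may differ. The naive walk below takes time proportional to the depth of v; an improved solution answers the same query much faster.

```python
from typing import List, Optional


def weighted_level_ancestor(
    parent: List[int], weight: List[int], v: int, w: int
) -> Optional[int]:
    """Naive weighted level ancestor query (illustrative assumption).

    parent[x] is the parent of node x, or -1 for the root.
    weight[x] is assumed to increase strictly from the root downward.
    Returns the ancestor of v (possibly v itself) closest to the root
    whose weight is >= w, or None if even v is lighter than w.
    """
    if weight[v] < w:
        return None
    node = v
    # Climb while the parent still satisfies the weight threshold.
    while parent[node] != -1 and weight[parent[node]] >= w:
        node = parent[node]
    return node


# Small example tree: 0 is the root; 1 and 4 are its children;
# 2 and 3 are children of 1.
parent = [-1, 0, 1, 1, 0]
weight = [0, 2, 5, 4, 3]
```

In a suffix tree, where a node's weight is its string depth, a query of this kind is a standard way to locate the node that represents a given substring of an indexed text.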

Abstract

We study a new variant of the string matching problem called cross-document string matching, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern itself is a substring of another document. Several variants of this problem are considered, and efficient linear-space solutions are proposed with query time bounds that either do not depend at all on the pattern size or depend on it in a very limited way (doubly logarithmic). As a side result, we propose an improved solution to the weighted level ancestor problem.
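
To make the query in the abstract concrete, here is a naive baseline sketch, not the paper's method. It assumes a query is given as a source document index `k`, an inclusive 0-based range `i..j` selecting the pattern inside that document, and a target document index `l`. The function names and the argument order are hypothetical. The baseline extracts the pattern and scans the target document, so its cost grows with both the pattern and the target; the indexed solutions described in the abstract avoid exactly this dependence on the pattern size.

```python
from typing import List


def cross_document_report(
    docs: List[str], k: int, i: int, j: int, l: int
) -> List[int]:
    """Report all start positions of docs[k][i..j] inside docs[l].

    Naive scan for illustration only; positions are 0-based and the
    range i..j is inclusive. Overlapping occurrences are reported.
    """
    pattern = docs[k][i:j + 1]
    text = docs[l]
    positions: List[int] = []
    if not pattern:
        return positions
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Advance by one character so overlapping matches are found.
        start = text.find(pattern, start + 1)
    return positions


def cross_document_count(
    docs: List[str], k: int, i: int, j: int, l: int
) -> int:
    """Count the occurrences of docs[k][i..j] inside docs[l]."""
    return len(cross_document_report(docs, k, i, j, l))


docs = ["abracadabra", "cadabracad", "aaaa"]
# Pattern "abra" = docs[0][0..3], searched in docs[1].
print(cross_document_report(docs, 0, 0, 3, 1))
# Pattern "aa" = docs[2][0..1], searched in docs[2] itself.
print(cross_document_count(docs, 2, 0, 1, 2))
```

Reporting and counting are shown as separate functions because they are natural variants of the same query; which variants the paper actually treats is not spelled out in the abstract.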
