The Fine-Grained Complexity of Episode Matching

Philip Bille; Inge Li Gørtz; Shay Mozes; Teresa Anna Steiner; Oren Weimann

arXiv:2108.08613 · cs.DS · February 15, 2024

Open Access

TL;DR

This paper shows that the long-standing $\tilde{O}(nm)$ bound for Episode Matching cannot be improved to $O((nm)^{1-\epsilon})$ unless SETH is false, gives near-optimal data structures for the indexing version of the problem, and presents a faster solution for patterns of length two.

Contribution

The paper proves SETH-based lower bounds for Episode Matching, provides near-optimal data structures for indexing, and offers a faster solution for the case when pattern length is two.

Findings

01

No $O((nm)^{1-\epsilon})$ algorithm exists for any $\epsilon > 0$, even for binary strings, unless SETH is false.

02

A space-efficient data structure answers queries in near-logarithmic time.

03

Faster algorithms are developed for patterns of length two.

Abstract

Given two strings $S$ and $P$, the Episode Matching problem is to find the shortest substring of $S$ that contains $P$ as a subsequence. The best known upper bound for this problem is $\tilde{O}(nm)$ by Das et al. (1997), where $n$ and $m$ are the lengths of $S$ and $P$, respectively. Although the problem is well studied and has many applications in data mining, this bound has never been improved. In this paper we show why this is the case by proving that no $O((nm)^{1-\epsilon})$ algorithm (even for binary strings) exists, unless the Strong Exponential Time Hypothesis (SETH) is false. We then consider the indexing version of the problem, where $S$ is preprocessed into a data structure for answering episode matching queries $P$. We show that for any $\tau$, there is a data structure using $O(n + (\frac{n}{\tau})^{k})$ space that answers episode matching queries for any $P$ of length…
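
To make the problem statement concrete, here is a minimal sketch of a textbook dynamic program that solves Episode Matching in $O(nm)$ time and $O(m)$ space. It is not taken from the paper and is not claimed to be the algorithm of Das et al.; the function name `shortest_episode` and the array `latest` are hypothetical names chosen for illustration.

```python
def shortest_episode(s, p):
    """Return (start, end) such that s[start:end] is a shortest substring
    of s containing p as a subsequence, or None if no such substring exists.

    Illustrative O(n*m) dynamic program; names are assumptions, not the paper's.
    """
    m = len(p)
    if m == 0:
        return (0, 0)  # the empty pattern is contained in the empty substring

    # latest[i] = largest start index such that p[:i+1] is a subsequence of
    # s[start:j+1] for the current position j, or -1 if there is none yet.
    latest = [-1] * m
    best = None

    for j, c in enumerate(s):
        # Go from high i to low i so that latest[i - 1] still refers to
        # positions strictly before j when it is read.
        for i in range(m - 1, -1, -1):
            if c == p[i]:
                latest[i] = j if i == 0 else latest[i - 1]

        # A full occurrence ends at j only if s[j] matches the last pattern symbol.
        if c == p[-1] and latest[-1] >= 0:
            start = latest[-1]
            if best is None or (j + 1 - start) < (best[1] - best[0]):
                best = (start, j + 1)

    return best
```

For example, `shortest_episode("axbxcabc", "abc")` first finds the window `"axbxc"` and then replaces it with the shorter window `"abc"` at the end of the string. The nested loops over $S$ and $P$ are where the $nm$ term comes from, which is the dependence the lower bound in the abstract says cannot be polynomially improved unless SETH is false.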

Taxonomy

Topics: Algorithms and Data Compression · Data Mining Algorithms and Applications · Machine Learning and Algorithms