Self-Bounded Prediction Suffix Tree via Approximate String Matching

Dongwoo Kim; Christian Walder

arXiv:1802.03184 · cs.LG · August 8, 2018

TL;DR

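This paper introduces a prediction suffix tree (PST) learning algorithm that matches suffixes approximately rather than exactly and bounds its own depth, growing the tree only in response to model performance. In experiments on synthetic datasets and three real-world datasets, it predicts sequences better than other PST variants.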

Contribution

It presents a provably correct algorithm for learning a PST with approximate suffix matching, together with a self-bounded enhancement in which the tree depth grows automatically in response to model performance on a training sequence.

Findings

01 Better predictive accuracy than existing PST variants

02 Effective on both synthetic and real-world datasets

03 Automatic depth adjustment improves model adaptability

Abstract

Prediction suffix trees (PSTs) provide an effective tool for sequence modelling and prediction. Current prediction techniques for PSTs rely on exact matching between the suffix of the current sequence and the previously observed sequence. We present a provably correct algorithm for learning a PST with approximate suffix matching by relaxing the exact matching condition. We then present a self-bounded enhancement of our algorithm in which the depth of the suffix tree grows automatically in response to the model's performance on a training sequence. Through experiments on synthetic datasets as well as three real-world datasets, we show that the approximate matching PST results in better predictive performance than the other variants of PST.
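
The difference between exact and approximate suffix matching can be illustrated with a small count-based sketch. This is not the authors' algorithm: the paper's learning rule and its self-bounded depth growth are not reproduced here. The function names (`hamming`, `build_context_counts`, `predict`), the use of Hamming distance as the relaxation, and the `max_mismatch` parameter are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def build_context_counts(seq, depth):
    """Map every context of length 1..depth to counts of the symbol that followed it."""
    counts = defaultdict(Counter)
    for i in range(1, len(seq)):
        for d in range(1, min(depth, i) + 1):
            counts[seq[i - d:i]][seq[i]] += 1
    return counts


def predict(counts, history, depth, max_mismatch=0):
    """Predict the next symbol from the longest suffix of `history` that matches
    a stored context within `max_mismatch` substitutions (assumed relaxation).

    max_mismatch=0 reduces to exact suffix matching.
    Returns None when no context matches at any length.
    """
    for d in range(min(depth, len(history)), 0, -1):
        suffix = history[-d:]
        votes = Counter()
        for ctx, next_symbols in counts.items():
            if len(ctx) == d and hamming(ctx, suffix) <= max_mismatch:
                votes.update(next_symbols)
        if votes:
            return votes.most_common(1)[0][0]
    return None


counts = build_context_counts("abcxabcxabcx", depth=3)
exact = predict(counts, "aby", depth=3, max_mismatch=0)
approx = predict(counts, "aby", depth=3, max_mismatch=1)
```

In this toy example the history `"aby"` ends in a symbol never seen in training, so exact matching finds no context at any depth, while allowing one substitution lets the stored context `"abc"` vote for its successor. The linear scan over all contexts is chosen for brevity; a tree-structured index would be the natural choice at scale.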

Taxonomy

Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · Web Data Mining and Analysis