TL;DR
This paper introduces a Bayesian framework for learning parsimonious variable-order Markov models, called Bayesian context trees, which efficiently capture complex dependencies in categorical sequences for improved prediction.
Contribution
It presents a novel Bayesian modeling approach with conjugate priors for context trees, reducing parameters and enabling efficient approximate inference for complex sequence data.
Findings
Outperforms existing models on protein sequences
Effective in real-time data stream processing
Reduces model complexity while capturing dependencies
Abstract
Models for categorical sequences typically assume exchangeable or first-order dependent sequence elements. These are common assumptions, for example, in models of computer malware traces and protein sequences. Although such simplifying assumptions lead to computational tractability, these models fail to capture long-range, complex dependence structures that may be harnessed for greater predictive power. To this end, a Bayesian modelling framework is proposed to parsimoniously capture rich dependence structures in categorical sequences, with memory efficiency suitable for real-time processing of data streams. Parsimonious Bayesian context trees are introduced as a form of variable-order Markov model with conjugate prior distributions. The novel framework requires fewer parameters than fixed-order Markov models by dropping redundant dependencies and clustering sequential contexts.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
