Efficient algorithms for training the parameters of hidden Markov models using stochastic expectation maximization (EM) training and Viterbi training

Tin Yin Lam; Irmtraud M. Meyer

arXiv:0909.0737 · q-bio.QM · October 18, 2012


TL;DR

This paper presents two new efficient algorithms for training hidden Markov models that reduce memory usage and simplify implementation, enabling more complex models and longer sequences in bioinformatics applications.

Contribution

The authors introduce one-step, single-pass algorithms for Viterbi and stochastic EM training that are memory-efficient and easier to implement than existing methods.

Findings

01. Memory requirements are independent of sequence length.

02. Algorithms successfully tested on small bioinformatics models.

03. Enhanced computational efficiency for training complex HMMs.

Abstract

Background: Hidden Markov models are widely employed by numerous bioinformatics programs used today. Applications range widely from comparative gene prediction to time-series analyses of micro-array data. The parameters of the underlying models need to be adjusted for specific data sets, for example the genome of a particular species, in order to maximize the prediction accuracy. Computationally efficient algorithms for parameter training are thus key to maximizing the usability of a wide range of bioinformatics applications.

Results: We introduce two computationally efficient training algorithms, one for Viterbi training and one for stochastic expectation maximization (EM) training, which render the memory requirements independent of the sequence length. Unlike the existing algorithms for Viterbi and stochastic EM training which require a two-step procedure, our two new algorithms…
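To make the idea of length-independent memory concrete, here is a minimal Python sketch of how the counts needed for one Viterbi training iteration could be collected in a single left-to-right pass. It is an illustration written for this summary, not the authors' algorithm as published: the function name `viterbi_counts_single_pass`, the dictionary-based model layout, and the toy two-state model are all assumptions. Each state carries the transition and emission counts of the best path ending in it, so no traceback matrix over the whole sequence is stored.

```python
import math

def viterbi_counts_single_pass(obs, states, log_init, log_trans, log_emit):
    """Count transitions and emissions along the Viterbi path in one pass.

    Hypothetical sketch: assumes a non-empty observation sequence and
    strictly positive probabilities (so every log value is finite).
    """
    # score[s]: best log-probability of any state path ending in s so far
    score = {s: log_init[s] + log_emit[s][obs[0]] for s in states}
    # Counts carried along the best path ending in each state
    t_counts = {s: {} for s in states}
    e_counts = {s: {(s, obs[0]): 1} for s in states}

    for x in obs[1:]:
        new_score, new_t, new_e = {}, {}, {}
        for s in states:
            # Best predecessor of s at this position
            prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[prev] + log_trans[prev][s] + log_emit[s][x]
            # Copy the predecessor's counts and add the new transition/emission
            tc = dict(t_counts[prev])
            tc[(prev, s)] = tc.get((prev, s), 0) + 1
            ec = dict(e_counts[prev])
            ec[(s, x)] = ec.get((s, x), 0) + 1
            new_t[s], new_e[s] = tc, ec
        # Only the current position's tables are kept
        score, t_counts, e_counts = new_score, new_t, new_e

    best = max(states, key=lambda s: score[s])
    return t_counts[best], e_counts[best]

def _log_table(table):
    """Convert a nested dict of probabilities to log space."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}

# Toy two-state model (illustrative values only)
states = ["A", "B"]
log_init = {"A": math.log(0.5), "B": math.log(0.5)}
log_trans = _log_table({"A": {"A": 0.8, "B": 0.2}, "B": {"A": 0.2, "B": 0.8}})
log_emit = _log_table({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.1, "y": 0.9}})

trans_counts, emit_counts = viterbi_counts_single_pass(
    "xxyy", states, log_init, log_trans, log_emit
)
```

In this sketch each per-state table holds at most one entry per distinct transition or emission, so its size is bounded by the number of model parameters rather than by the sequence length. Normalizing the returned counts would give the re-estimated transition and emission probabilities for the next training iteration.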
