Hidden Markov Models with Momentum
Andrew Miller, Fabio Di Troia, Mark Stamp

TL;DR
This paper explores the impact of incorporating momentum into the Baum-Welch algorithm for Hidden Markov Models, showing it accelerates initial convergence but does not enhance final accuracy.
Contribution
It introduces and evaluates momentum in Baum-Welch training for HMMs, an approach not previously studied.
Findings
Momentum reduces initial convergence time.
Final model performance remains unchanged with momentum.
Momentum helps most when the model is otherwise slow to converge.
Abstract
Momentum is a popular technique for improving convergence rates during gradient descent. In this research, we experiment with adding momentum to the Baum-Welch expectation-maximization algorithm for training Hidden Markov Models. We compare discrete Hidden Markov Models trained with and without momentum on English text and malware opcode data. The effectiveness of momentum is determined by measuring the changes in model score and classification accuracy due to momentum. Our extensive experiments indicate that adding momentum to Baum-Welch can reduce the number of iterations required for initial convergence during HMM training, particularly in cases where the model is slow to converge. However, momentum does not seem to improve the final model performance at a high number of iterations.
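The abstract does not say how momentum is combined with the Baum-Welch re-estimation step. As an illustration only, the sketch below shows one plausible formulation, not necessarily the authors' method: treat the difference between the re-estimated and the current matrix as the "gradient", accumulate it in a velocity term, then clip and renormalize so each row stays a probability distribution. The function name `momentum_step`, the coefficient `mu`, and the floor `eps` are all hypothetical.

```python
def momentum_step(old, reestimated, velocity, mu=0.5, eps=1e-12):
    """One hypothetical momentum update for a row-stochastic matrix.

    old         -- current matrix (e.g. HMM transition matrix A), list of rows
    reestimated -- matrix proposed by a standard Baum-Welch re-estimation
    velocity    -- running velocity from the previous iteration (same shape)
    mu          -- momentum coefficient (assumed name; mu=0 gives plain Baum-Welch)
    """
    new_matrix, new_velocity = [], []
    for old_row, est_row, vel_row in zip(old, reestimated, velocity):
        # Velocity accumulates the change proposed by the re-estimate.
        v = [mu * vel + (est - cur)
             for cur, est, vel in zip(old_row, est_row, vel_row)]
        # Step along the velocity, clipping so no entry becomes non-positive.
        stepped = [max(cur + dv, eps) for cur, dv in zip(old_row, v)]
        # Renormalize so the row sums to 1 again.
        total = sum(stepped)
        new_matrix.append([p / total for p in stepped])
        new_velocity.append(v)
    return new_matrix, new_velocity


# Toy usage: two consecutive updates toward the same re-estimate.
A_old = [[0.5, 0.5], [0.5, 0.5]]
A_est = [[0.6, 0.4], [0.3, 0.7]]
V0 = [[0.0, 0.0], [0.0, 0.0]]
A1, V1 = momentum_step(A_old, A_est, V0, mu=0.5)
A2, V2 = momentum_step(A1, A_est, V1, mu=0.5)
```

With zero initial velocity the first update simply adopts the re-estimate; on the second update the carried-over velocity pushes the parameters past it. That overshoot is how a scheme of this kind could speed up early iterations, and also why it need not improve the final model.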
Taxonomy
Topics: Advanced Malware Detection Techniques · Machine Learning and Algorithms · Spam and Phishing Detection
