On the Convergence of Bound Optimization Algorithms
Ruslan R Salakhutdinov, Sam T Roweis, Zoubin Ghahramani

TL;DR
This paper analyzes the convergence behavior of bound optimization algorithms like EM, identifying conditions for slow or fast convergence, and demonstrates how data preprocessing can significantly enhance their performance.
Contribution
The paper derives a general relationship between the updates of bound optimization methods and those of gradient and second-order methods, and offers preprocessing recipes for improving convergence speed.
Findings
Bound optimization can exhibit quasi-Newton behavior under certain conditions.
Poor convergence occurs when bound optimization behaves like a first-order method.
Data preprocessing can dramatically improve the convergence speed of bound optimizers.
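The first two findings can be illustrated with a small sketch. It is not taken from the paper: the model, function names, and parameter values are assumptions chosen for illustration. The assumed model is a symmetric two-component Gaussian mixture, 0.5*N(mu, 1) + 0.5*N(-mu, 1), with a single unknown mean mu. For this model the EM update is mu_new = mean(x * tanh(mu * x)), which equals mu plus the gradient of the average log-likelihood, so EM here is a gradient step with a fixed step size of 1. The Hessian is mean(x^2 / cosh(mu * x)^2) - 1. When the clusters are well separated the Hessian is close to -1 and the EM step nearly matches the Newton step; when they overlap the Hessian is close to 0 and the EM step is much shorter than the Newton step.

```python
import math
import random


def sample_symmetric_mixture(mu, n, seed):
    """Draw n points from 0.5*N(mu, 1) + 0.5*N(-mu, 1) (illustrative model)."""
    rng = random.Random(seed)
    return [rng.gauss(mu if rng.random() < 0.5 else -mu, 1.0) for _ in range(n)]


def em_step(mu, xs):
    """One EM update for mu; E-step and M-step collapse to mean(x * tanh(mu * x))."""
    return sum(x * math.tanh(mu * x) for x in xs) / len(xs)


def gradient(mu, xs):
    """Gradient of the average log-likelihood with respect to mu."""
    return sum(x * math.tanh(mu * x) - mu for x in xs) / len(xs)


def newton_to_em_ratio(mu, xs):
    """Length of the Newton step divided by the EM step at mu, i.e. -1 / Hessian."""
    hess = sum((x / math.cosh(mu * x)) ** 2 for x in xs) / len(xs) - 1.0
    return -1.0 / hess


def run_em(xs, mu0=1.0, tol=1e-8, max_iter=10000):
    """Iterate EM from mu0 until the update is smaller than tol.

    Returns the final estimate and the number of iterations used.
    """
    mu = mu0
    for it in range(1, max_iter + 1):
        new_mu = em_step(mu, xs)
        if abs(new_mu - mu) < tol:
            return new_mu, it
        mu = new_mu
    return mu, max_iter


if __name__ == "__main__":
    # True means 3.0 (well separated) and 0.5 (overlapping) are arbitrary choices.
    for true_mu in (3.0, 0.5):
        xs = sample_symmetric_mixture(true_mu, n=2000, seed=0)
        mu_hat, iters = run_em(xs)
        ratio = newton_to_em_ratio(mu_hat, xs)
        print(f"true mu={true_mu}: estimate={mu_hat:.4f}, "
              f"iterations={iters}, Newton/EM step ratio={ratio:.3f}")
```

Under these assumptions, the well-separated data should converge in a handful of iterations with a step ratio near 1, while the overlapping data should need many more iterations with a ratio well above 1. The sketch does not cover the preprocessing recipes, which the summary above does not describe in detail.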
Abstract
Many practitioners who use the EM algorithm complain that it is sometimes slow. When does this happen, and what can be done about it? In this paper, we study the general class of bound optimization algorithms (including Expectation-Maximization, Iterative Scaling and CCCP) and their relationship to direct optimization algorithms such as gradient-based methods for parameter learning. We derive a general relationship between the updates performed by bound optimization methods and those of gradient and second-order methods and identify analytic conditions under which bound optimization algorithms exhibit quasi-Newton behavior, and conditions under which they possess poor, first-order convergence. Based on this analysis, we consider several specific algorithms, interpret and analyze their convergence properties and provide some recipes for preprocessing input to these algorithms to yield…
Taxonomy
Topics: Face and Expression Recognition · Sparse and Compressive Sensing Techniques · Blind Source Separation Techniques
