Convergence Analysis of Block Coordinate Algorithms with Determinantal Sampling

Mojmír Mutný; Michał Dereziński; Andreas Krause

arXiv:1910.11561 · math.NA · February 13, 2020 · 5 cites

Open Access

TL;DR

This paper provides a convergence analysis of a randomized Newton-like method using determinantal sampling, revealing how eigenvalues influence convergence and introducing a new expectation formula for determinantal point processes.

Contribution

It introduces a novel convergence analysis for block coordinate algorithms with determinantal sampling, linking convergence rates to eigenvalue spectra and deriving a new expectation formula.

Findings

01

Under determinantal sampling, the convergence rate depends only on the eigenvalue distribution of the matrix M.

02

The new expectation formula for determinantal point processes makes it possible to reason about the optimal block (subset) size in terms of the spectrum of M.

03

Numerical results show determinantal sampling can outperform uniform sampling in certain cases.

Abstract

We analyze the convergence rate of the randomized Newton-like method introduced by Qu et al. (2016) for smooth and convex objectives, which uses random coordinate blocks of a Hessian-over-approximation matrix $\mathbf{M}$ instead of the true Hessian. The convergence analysis of the algorithm is challenging because of its complex dependence on the structure of $\mathbf{M}$. However, we show that when the coordinate blocks are sampled with probability proportional to their determinant, the convergence rate depends solely on the eigenvalue distribution of matrix $\mathbf{M}$, and has an analytically tractable form. To do so, we derive a fundamental new expectation formula for determinantal point processes. We show that determinantal sampling allows us to reason about the optimal subset size of blocks in terms of the spectrum of $\mathbf{M}$. Additionally, we provide a numerical evaluation of our analysis,…
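
To make the sampling scheme in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation, and it rests on several assumptions: the objective is a toy quadratic f(x) = 1/2 x^T M x - b^T x, so M is the exact Hessian rather than an over-approximation; blocks have a fixed size k; the determinantal distribution is built by brute-force enumeration of all size-k subsets, which is only feasible for small n; and the update x_S <- x_S - (M_SS)^{-1} (grad f(x))_S is our reading of the block Newton-like step. The function names `make_problem`, `enumerate_k_dpp`, and `block_newton` are hypothetical.

```python
import itertools
import numpy as np


def make_problem(n=6, seed=1):
    """Build a small positive definite M and a vector b (toy data, assumed)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    M = A @ A.T + np.eye(n)  # symmetric positive definite
    b = rng.standard_normal(n)
    return M, b


def enumerate_k_dpp(M, k):
    """List all size-k subsets S with probabilities proportional to det(M[S, S]).

    Brute force over all subsets; intended for illustration on small n only.
    Returns the subsets, their probabilities, and the normalizing constant.
    """
    n = M.shape[0]
    subsets = [list(s) for s in itertools.combinations(range(n), k)]
    weights = np.array([np.linalg.det(M[np.ix_(s, s)]) for s in subsets])
    total = weights.sum()
    return subsets, weights / total, total


def block_newton(M, b, k, steps, seed=0):
    """Minimize 1/2 x^T M x - b^T x with determinantally sampled blocks."""
    rng = np.random.default_rng(seed)
    subsets, probs, _ = enumerate_k_dpp(M, k)
    x = np.zeros(M.shape[0])
    for _ in range(steps):
        S = subsets[rng.choice(len(subsets), p=probs)]
        grad = M @ x - b  # gradient of the quadratic
        # Newton-like step restricted to the sampled coordinates.
        x[S] -= np.linalg.solve(M[np.ix_(S, S)], grad[S])
    return x


if __name__ == "__main__":
    M, b = make_problem()
    x = block_newton(M, b, k=2, steps=3000)
    print("distance to minimizer:", np.linalg.norm(x - np.linalg.solve(M, b)))
```

One property the sketch exposes is that the normalizing constant of the fixed-size determinantal distribution is a function of the spectrum alone: the sum of all k-by-k principal minors of M equals the k-th elementary symmetric polynomial of its eigenvalues. This standard identity is in the same spirit as the abstract's claim that the convergence rate depends only on the eigenvalues of M, although it is not the paper's new expectation formula.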

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods