Ensemble Kalman Inversion: A Derivative-Free Technique For Machine Learning Tasks

Nikola B. Kovachki; Andrew M. Stuart

arXiv:1808.03620·cs.LG·September 4, 2019

TL;DR

This paper casts supervised and semi-supervised learning tasks as inverse or filtering problems and solves them with ensemble Kalman inversion (EKI), an efficient, derivative-free alternative to stochastic gradient descent.

Contribution

It formulates empirical risk-minimization tasks as classical inverse or filtering problems and proposes a gradient-free EKI algorithm for solving them, with several modifications derived from heuristics that have proven empirically successful for SGD.

Findings

01 EKI is effective for deep neural network training.

02 The method demonstrates robustness across different learning tasks.

03 Numerical experiments show wide applicability of the approach.

Abstract

The standard probabilistic perspective on machine learning gives rise to empirical risk-minimization tasks that are frequently solved by stochastic gradient descent (SGD) and variants thereof. We present a formulation of these tasks as classical inverse or filtering problems and, furthermore, we propose an efficient, gradient-free algorithm for finding a solution to these problems using ensemble Kalman inversion (EKI). Applications of our approach include offline and online supervised learning with deep neural networks, as well as graph-based semi-supervised learning. The essence of the EKI procedure is an ensemble based approximate gradient descent in which derivatives are replaced by differences from within the ensemble. We suggest several modifications to the basic method, derived from empirically successful heuristics developed in the context of SGD. Numerical results demonstrate…
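To make the idea of replacing derivatives with differences from within the ensemble concrete, here is a minimal sketch of one basic EKI update in Python with NumPy. It assumes the widely used textbook form of the iteration, u_j ← u_j + C_uw (C_ww + Γ)^(-1) (y − G(u_j)), where C_uw and C_ww are empirical covariances computed from the ensemble. It does not include the SGD-inspired modifications the abstract mentions. The names `eki_step` and `run_demo`, the linear forward map, the noise level, and the 1/J covariance normalization are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def eki_step(U, G, y, Gamma):
    """One basic EKI update (illustrative sketch, not the paper's exact algorithm).

    U     : (J, d) array, ensemble of J parameter vectors
    G     : callable mapping a (d,) parameter vector to a (k,) prediction
    y     : (k,) observed data
    Gamma : (k, k) observational noise covariance
    """
    J = U.shape[0]
    W = np.stack([G(u) for u in U])        # (J, k) forward-map evaluations

    # Deviations from the ensemble means stand in for derivatives of G.
    dU = U - U.mean(axis=0)
    dW = W - W.mean(axis=0)

    C_uw = dU.T @ dW / J                   # (d, k) parameter/output cross-covariance
    C_ww = dW.T @ dW / J                   # (k, k) output covariance

    # Solve (C_ww + Gamma) X = (y - W)^T rather than forming an explicit inverse.
    X = np.linalg.solve(C_ww + Gamma, (y - W).T)   # (k, J)
    return U + (C_uw @ X).T                # (J, d) updated ensemble


def run_demo(seed=0, steps=20):
    """Apply EKI to a small synthetic linear inverse problem (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(8, 3))            # assumed linear forward map G(u) = A u
    u_true = np.array([1.0, -2.0, 0.5])
    y = A @ u_true                         # noise-free synthetic data
    Gamma = 1e-2 * np.eye(8)               # assumed noise covariance
    U = rng.normal(size=(50, 3))           # initial ensemble drawn from N(0, I)

    def G(u):
        return A @ u

    initial_misfit = np.linalg.norm(A @ U.mean(axis=0) - y)
    for _ in range(steps):
        U = eki_step(U, G, y, Gamma)
    final_misfit = np.linalg.norm(A @ U.mean(axis=0) - y)
    return initial_misfit, final_misfit, U


if __name__ == "__main__":
    initial, final, _ = run_demo()
    print(f"data misfit of ensemble mean: {initial:.3e} -> {final:.3e}")
```

Note that `eki_step` only ever evaluates `G`; no gradient of the forward map is required, which is what makes the approach derivative-free. In a learning setting, `G` would map network parameters to predictions on the training data, an interpretation assumed here from the abstract rather than spelled out in it.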

Taxonomy

Methods: Stochastic Gradient Descent