Learning Maximum Entropy Models from finite size datasets: a fast Data-Driven algorithm allows sampling from the posterior distribution

Ulisse Ferrari

arXiv:1507.04254·cond-mat.dis-nn·September 21, 2016

TL;DR

This paper introduces a fast, data-driven algorithm for learning maximum entropy models from large but finite datasets. It corrects for the inhomogeneous curvature of the parameter space, which slows down steepest descent, and its long-time dynamics sample from the posterior distribution instead of converging to a single fixed point.

Contribution

The author characterizes the learning dynamics that maximize the log-likelihood and develops a rectified, data-driven learning algorithm. The rectification of the parameter space relies only on dataset properties and does not require large computational effort.

Findings

01

The new algorithm outperforms steepest descent in learning pairwise Ising models.

02

With Gibbs-sampling noise included, the rectified dynamics reach a stationary distribution that reproduces the posterior, reducing over- and under-fitting.

03

The method is validated on neural data from retina recordings.

Abstract

Maximum entropy models provide the least constrained probability distributions that reproduce statistical properties of experimental datasets. In this work we characterize the learning dynamics that maximizes the log-likelihood in the case of large but finite datasets. We first show how the steepest descent dynamics is not optimal as it is slowed down by the inhomogeneous curvature of the model parameters space. We then provide a way for rectifying this space which relies only on dataset properties and does not require large computational efforts. We conclude by solving the long-time limit of the parameters dynamics including the randomness generated by the systematic use of Gibbs sampling. In this stochastic framework, rather than converging to a fixed point, the dynamics reaches a stationary distribution, which for the rectified dynamics reproduces the posterior distribution of the…
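The contrast the abstract draws between steepest descent and a rectified dynamics can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the model is a 4-spin pairwise Ising model, model averages are computed by exact enumeration instead of Gibbs sampling, and the "rectification" is taken to be preconditioning the gradient with the inverse of the empirical covariance of the observables, which is one quantity that depends only on the dataset. The function names (`features`, `model_means`, `fit`) and the learning rates are hypothetical choices.

```python
import itertools
import numpy as np

def features(n):
    """All 2**n spin states mapped to observables: single spins and pairwise products."""
    states = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)
    pairs = list(itertools.combinations(range(n), 2))
    pair_feats = np.stack([states[:, i] * states[:, j] for i, j in pairs], axis=1)
    return np.hstack([states, pair_feats])

def model_means(theta, feats):
    """Exact model averages of the observables under P(s) proportional to exp(theta . f(s))."""
    logw = feats @ theta
    p = np.exp(logw - logw.max())
    p /= p.sum()
    return p @ feats

def fit(data_means, feats, precond, lr, steps):
    """Ascend the log-likelihood; its gradient is (data means - model means)."""
    theta = np.zeros(feats.shape[1])
    for _ in range(steps):
        grad = data_means - model_means(theta, feats)
        theta = theta + lr * precond @ grad
    residual = np.linalg.norm(data_means - model_means(theta, feats))
    return theta, residual

rng = np.random.default_rng(0)
n = 4
feats = features(n)
dim = feats.shape[1]

# Synthetic "dataset": draw 5000 configurations from a known model (illustrative only).
theta_true = rng.normal(0.0, 0.3, size=dim)
p_true = np.exp(feats @ theta_true)
p_true /= p_true.sum()
samples = feats[rng.choice(len(feats), size=5000, p=p_true)]
data_means = samples.mean(axis=0)

# Assumed rectification: empirical covariance of the observables, with a small ridge.
chi_data = np.cov(samples, rowvar=False) + 1e-6 * np.eye(dim)

_, err_plain = fit(data_means, feats, np.eye(dim), lr=0.1, steps=200)
_, err_rect = fit(data_means, feats, np.linalg.inv(chi_data), lr=0.5, steps=200)

print("residual, steepest ascent:", err_plain)
print("residual, preconditioned: ", err_rect)
```

In this toy setting the plain update moves slowly along directions where the covariance of the observables is small, while the preconditioned update treats all directions on a similar footing. The sketch does not include the sampling noise that the abstract analyses, so it says nothing about the stationary distribution of the stochastic dynamics.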
