Learning the Information Divergence

Onur Dikmen; Zhirong Yang; Erkki Oja

arXiv:1406.1385 · cs.LG · June 6, 2014 · 1 citation

Open Access

TL;DR

This paper introduces a framework that automatically selects the most suitable information divergence within a given family for a machine learning task. It does so by reformulating divergence families and choosing the family parameter with standard maximum likelihood estimation.

Contribution

The paper proposes selecting the optimal divergence within a family by maximum likelihood. For the beta-divergence family this relies on an approximated Tweedie distribution; reformulations and connections between divergence types carry the approach over to other families.

Findings

01. Accurately selects divergences across various tasks

02. Demonstrates effectiveness on synthetic and real data

03. Extends framework to non-separable divergences

Abstract

Information divergence that measures the difference between two nonnegative matrices or tensors has found its use in a variety of machine learning problems. Examples are Nonnegative Matrix/Tensor Factorization, Stochastic Neighbor Embedding, topic models, and Bayesian network optimization. The success of such a learning task depends heavily on a suitable divergence. A large variety of divergences have been suggested and analyzed, but very few results are available for an objective choice of the optimal divergence for a given task. Here we present a framework that facilitates automatic selection of the best divergence among a given family, based on standard maximum likelihood estimation. We first propose an approximated Tweedie distribution for the beta-divergence family. Selecting the best beta then becomes a machine learning problem solved by maximum likelihood. Next, we reformulate…
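To make the object being selected concrete, here is a minimal sketch of the separable beta-divergence between two nonnegative arrays. It is not the paper's selection procedure, which relies on the approximated Tweedie likelihood; it only evaluates the divergence for a given beta. The function name `beta_divergence` and the parametrization (beta = 1 gives half the squared Euclidean distance, beta → 0 the generalized Kullback-Leibler divergence, beta → -1 the Itakura-Saito divergence) are assumptions made for illustration. Other sources shift beta by one.

```python
import numpy as np


def beta_divergence(x, mu, beta):
    """Separable beta-divergence D_beta(x || mu), summed over all entries.

    Assumed parametrization (hypothetical helper, not taken from the paper):
      beta = 1   -> half the squared Euclidean distance
      beta -> 0  -> generalized Kullback-Leibler divergence
      beta -> -1 -> Itakura-Saito divergence
    Entries of x and mu are assumed strictly positive.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)

    # The general formula divides by beta * (beta + 1), so the two
    # singular values are handled through their analytic limits.
    if np.isclose(beta, 0.0):
        return float(np.sum(x * np.log(x / mu) - x + mu))
    if np.isclose(beta, -1.0):
        return float(np.sum(x / mu - np.log(x / mu) - 1.0))

    numerator = (
        x ** (beta + 1)
        + beta * mu ** (beta + 1)
        - (beta + 1) * x * mu ** beta
    )
    return float(np.sum(numerator) / (beta * (beta + 1)))


if __name__ == "__main__":
    x = np.array([1.0, 2.0])
    mu = np.array([2.0, 4.0])
    for beta in (-1.0, 0.0, 1.0):
        print(beta, beta_divergence(x, mu, beta))
```

Each value of beta defines a different cost for the same pair of arrays, which is why an objective criterion for choosing beta, such as the maximum likelihood approach described in the abstract, is needed.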

Taxonomy

Topics: Tensor decomposition and applications · Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification