Nonextensive information theoretical machine

Chaobing Song; Shu-Tao Xia

arXiv:1604.06153 · cs.LG · April 22, 2016

TL;DR

This paper introduces the nonextensive information theoretical machine (NITM), a discriminative model based on Tsallis entropy. NITM unifies several margin-based loss functions, generalizes Gaussian prior regularization to Student-t prior regularization, and can be trained by gradient-based convex optimization.

Contribution

The paper proposes NITM, a model that unifies margin-based loss functions via unnormalized Tsallis entropy and extends the Gaussian prior to a Student-t prior at similar computational complexity, solved by gradient-based convex optimization.

Findings

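01. NITM unifies several margin-based loss functions ($\ell_{0/1}$ loss, hinge loss, squared hinge loss, and exponential loss) through unnormalized Tsallis entropy.

02. The model generalizes Gaussian prior regularization to Student-t prior regularization with similar computational complexity.

03. Performance is illustrated on standard datasets.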

Abstract

In this paper, we propose a new discriminative model named nonextensive information theoretical machine (NITM), based on a nonextensive generalization of Shannon information theory. In NITM, weight parameters are treated as random variables. Tsallis divergence is used to regularize the distribution of the weight parameters, and the maximum unnormalized Tsallis entropy distribution is used to evaluate the fit. On the one hand, it is shown that some well-known margin-based loss functions, such as the $\ell_{0/1}$ loss, hinge loss, squared hinge loss, and exponential loss, can be unified by unnormalized Tsallis entropy. On the other hand, Gaussian prior regularization is generalized to Student-t prior regularization with similar computational complexity. The model can be solved efficiently by gradient-based convex optimization, and its performance is illustrated on standard datasets.
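
The unification of loss functions claimed in the abstract can be illustrated with the standard Tsallis q-exponential, $\exp_q(x) = [1 + (1-q)x]_+^{1/(1-q)}$. The following is a minimal sketch under that assumption, not the paper's own formulation: the names `q_exp` and `q_loss` are hypothetical, and the paper's exact parameterization and scaling may differ. Applying $\exp_q$ to the negative margin gives the hinge loss at $q = 0$, a scaled squared hinge loss at $q = 1/2$, the exponential loss as $q \to 1$, and approaches the $\ell_{0/1}$ loss as $q \to -\infty$.

```python
import math


def q_exp(x: float, q: float) -> float:
    """Standard Tsallis q-exponential, restricted here to q <= 1.

    exp_q(x) = max(0, 1 + (1 - q) * x) ** (1 / (1 - q)), with exp_1(x) = exp(x).
    """
    if q > 1.0:
        raise ValueError("this sketch only covers q <= 1")
    if q == 1.0:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        return 0.0  # cut-off region of the q-exponential
    return base ** (1.0 / (1.0 - q))


def q_loss(margin: float, q: float) -> float:
    """Hypothetical margin-based loss: the q-exponential of the negative margin."""
    return q_exp(-margin, q)


if __name__ == "__main__":
    for margin in (-1.0, 0.5, 2.0):
        print(
            margin,
            q_loss(margin, 0.0),   # hinge: max(0, 1 - margin)
            q_loss(margin, 0.5),   # scaled squared hinge: max(0, 1 - margin / 2) ** 2
            q_loss(margin, 1.0),   # exponential: exp(-margin)
            q_loss(margin, -1e6),  # close to the 0/1 loss
        )
```

In this sketch a single scalar `q` moves continuously between the four losses, which is the sense in which one entropy family can cover them; how NITM selects or constrains that parameter is left to the paper.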

Taxonomy

Topics: Statistical Mechanics and Entropy · Advanced Statistical Methods and Models · Face and Expression Recognition