Assessing Information Transmission in Data Transformations with the Channel Multivariate Entropy Triangle

Francisco J. Valverde-Albacete; Carmen Peláez-Moreno

arXiv:1711.11510 · cs.IT · October 11, 2018

TL;DR

This paper introduces an information-theoretic framework and visual tools, including the Channel Multivariate Entropy Triangle, to evaluate how effectively data transformations transfer information in machine learning tasks.

Contribution

It presents a novel entropy decomposition, balance equations, and the Channel Multivariate Entropy Triangle for assessing data transformation quality in an unsupervised manner.

Findings

01

Effective visualization of information transfer in data transformations.

02

Application of tools to PCA and ICA shows their information transfer efficiency.

03

Decomposition reveals non-transferable and transferable information components.

Abstract

Data transformation, e.g., feature transformation and selection, is an integral part of any machine learning procedure. In this paper we introduce an information-theoretic model and tools to assess the quality of data transformations in machine learning tasks. In an unsupervised fashion, we analyze how information is transferred when a discrete, multivariate source of information X is transformed into a discrete, multivariate sink of information Y, the two being related by a distribution P_XY. The first contribution is a decomposition of the maximal potential entropy of (X, Y), which we call a balance equation, into its (a) non-transferable, (b) transferable but not transferred, and (c) transferred parts. Such balance equations can be represented in (de Finetti) entropy diagrams, our second set of contributions. The most important of these, the aggregate Channel Multivariate Entropy Triangle, is a visual…
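The three-part decomposition described in the abstract can be illustrated with a minimal sketch. It assumes one common reading of such a balance equation: the maximal (uniform) entropy of (X, Y) equals the divergence of the marginals from uniformity, plus twice the mutual information, plus the sum of the two conditional entropies. Mapping these terms onto the abstract's "non-transferable", "transferred", and "transferable but not transferred" parts is an assumption made here for illustration, as are the function and key names; the paper's exact multivariate definitions may differ. Each of X and Y is treated as a single composite symbol, so a multivariate block would first be encoded as one index.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def balance_terms(joint):
    """Split the uniform entropy of (X, Y) into three non-negative terms.

    joint[i][j] = P(X = i, Y = j). Names and the interpretation of each
    term are illustrative assumptions, not the paper's notation.
    """
    px = [sum(row) for row in joint]          # marginal of X
    py = [sum(col) for col in zip(*joint)]    # marginal of Y
    h_x, h_y = entropy(px), entropy(py)
    h_xy = entropy(p for row in joint for p in row)

    # Maximal potential entropy: both marginals uniform and independent.
    h_uniform = math.log2(len(px)) + math.log2(len(py))

    delta_h = h_uniform - (h_x + h_y)     # assumed "non-transferable" part
    mutual_info = h_x + h_y - h_xy        # assumed "transferred" part (counted twice)
    variation = 2 * h_xy - h_x - h_y      # H(X|Y) + H(Y|X), assumed "not transferred"

    return {
        "h_uniform": h_uniform,
        "delta_h": delta_h,
        "mutual_info": mutual_info,
        "variation": variation,
    }


def triangle_coordinates(terms):
    """Normalize the three terms so they sum to 1 (a point in a ternary plot)."""
    h = terms["h_uniform"]
    return (terms["delta_h"] / h, 2 * terms["mutual_info"] / h, terms["variation"] / h)


if __name__ == "__main__":
    # A noisy binary channel with uniform marginals (hypothetical example data).
    joint = [[0.4, 0.1], [0.1, 0.4]]
    terms = balance_terms(joint)
    print(terms)
    print(triangle_coordinates(terms))
```

Because the three normalized terms sum to 1, each joint distribution maps to a single point in a ternary (de Finetti) diagram, which is the kind of representation the abstract refers to.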

Code & Models

Repositories

FJValverde/entropies (Official)

Taxonomy

Methods: Independent Component Analysis · Principal Components Analysis