# Discrete Infomax Codes for Supervised Representation Learning

**Authors:** Yoonho Lee, Wonjae Kim, Wonpyo Park, Seungjin Choi

arXiv: 1905.11656 · 2020-02-25

## TL;DR

This paper introduces Discrete InfoMax Codes (DIMCO), a probabilistic encoder that learns compact k-way d-dimensional discrete codes by maximizing the mutual information between codes and labels. The resulting short codes reduce overfitting in few-shot classification and are efficient in both memory and retrieval time.

## Contribution

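The paper proposes a probabilistic encoder that outputs discrete codes and is trained to maximize the mutual information between codes and labels, with a regularizer that encourages the entries of each codeword to be as independent as possible. It also analyzes why shorter codes reduce overfitting in few-shot classification and reports memory and retrieval-time gains over previous methods.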

## Key findings

- Shorter codes reduce overfitting in few-shot classification.
- DIMCO codes are more efficient in memory and retrieval time than previous methods.
- The infomax principle recovers previous loss functions, such as cross-entropy, as special cases.
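
The cross-entropy finding can be motivated by a standard variational argument, sketched here under assumed notation: $X$ is the learned representation, $Y$ the label, $p$ the true joint distribution, and $q(y \mid x)$ any classifier. These symbols are illustrative and the paper's own derivation may differ in detail.

```latex
\begin{aligned}
I(X;Y) &= H(Y) - H(Y \mid X), \\
H(Y \mid X) &\le \mathbb{E}_{p(x,y)}\left[-\log q(y \mid x)\right], \\
\Rightarrow\quad I(X;Y) &\ge H(Y) - \mathbb{E}_{p(x,y)}\left[-\log q(y \mid x)\right].
\end{aligned}
```

The second line holds because the gap between the two sides is an expected KL divergence, which is non-negative. Since $H(Y)$ does not depend on the model, minimizing the cross-entropy term maximizes a lower bound on the mutual information.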

## Abstract

Learning compact discrete representations of data is a key task on its own or for facilitating subsequent processing of data. In this paper we present a model that produces Discrete InfoMax Codes (DIMCO); we learn a probabilistic encoder that yields k-way d-dimensional codes associated with input data. Our model's learning objective is to maximize the mutual information between codes and labels with a regularization, which enforces entries of a codeword to be as independent as possible. We show that the infomax principle also justifies previous loss functions (e.g., cross-entropy) as its special cases. Our analysis also shows that using shorter codes, as DIMCO does, reduces overfitting in the context of few-shot classification. Through experiments in various domains, we observe this implicit meta-regularization effect of DIMCO. Furthermore, we show that the codes learned by DIMCO are efficient in terms of both memory and retrieval time compared to previous methods.
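
To make the abstract's terms concrete, the following is a minimal sketch, not the authors' implementation. It assumes a k-way d-dimensional code is a tuple of d integers in `{0, ..., k-1}`, and it uses a simple plug-in estimate of mutual information on hard (already sampled) codes rather than the paper's training objective. The helper names `code_bits`, `entropy`, and `empirical_mutual_information` are hypothetical.

```python
from collections import Counter
import math


def code_bits(k: int, d: int) -> int:
    """Bits needed to store one k-way d-dimensional code.

    Each of the d entries takes one of k values, so it needs
    ceil(log2(k)) bits; (k - 1).bit_length() computes that exactly.
    """
    return d * (k - 1).bit_length()


def entropy(counts) -> float:
    """Shannon entropy (in bits) of the empirical distribution given by counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def empirical_mutual_information(codes, labels) -> float:
    """Plug-in estimate of I(code; label) = H(code) + H(label) - H(code, label)."""
    h_code = entropy(Counter(codes).values())
    h_label = entropy(Counter(labels).values())
    h_joint = entropy(Counter(zip(codes, labels)).values())
    return h_code + h_label - h_joint


# Toy data (made up for illustration): 2-way, 2-dimensional codes.
codes = [(0, 0), (0, 0), (1, 1), (1, 1)]
labels = ["cat", "cat", "dog", "dog"]

mi = empirical_mutual_information(codes, labels)  # codes determine labels here
bits = code_bits(k=16, d=4)                       # storage for a 16-way, 4-dim code
```

In this toy case the codes fully determine the labels, so the estimate equals the label entropy. The storage count illustrates why short discrete codes are compact: a 16-way 4-dimensional code fits in 16 bits, whereas a 64-dimensional float32 embedding (an arbitrary comparison point, not a figure from the paper) takes 2048 bits.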

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.11656/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/1905.11656/full.md

## References

59 references — full list in the complete paper: https://tomesphere.com/paper/1905.11656/full.md

---
Source: https://tomesphere.com/paper/1905.11656