# Differential Description Length for Hyperparameter Selection in Machine Learning

**Authors:** Mojtaba Abolfazli, Anders Host-Madsen, June Zhang

arXiv: 1902.04699 · 2019-05-23

## TL;DR

This paper proposes differential description length (DDL), a method for hyperparameter selection in machine learning that predicts generalization error from the training data alone. In the paper's experiments, DDL yields smaller generalization error than cross-validation, traditional MDL, and Bayes methods.

## Contribution

Introduces DDL, an approach that links a difference of description lengths of the training data to generalization error, and shows how to apply it to linear regression and neural networks.

## Key findings

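- DDL predicts generalization error from the training data alone, by encoding the training data.
- In the reported experiments, selecting models with DDL gives smaller generalization error than cross-validation.
- In the same experiments, DDL also gives smaller generalization error than traditional MDL and Bayes methods.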

## Abstract

This paper introduces a new method for model selection and, more generally, hyperparameter selection in machine learning. Minimum description length (MDL) is an established method for model selection, but it is not directly aimed at minimizing generalization error, which is often the primary goal in machine learning. The paper demonstrates a relationship between generalization error and a difference of description lengths of the training data; we call this difference differential description length (DDL). This allows generalization error to be predicted from the training data alone, by encoding the training data. DDL can then be used for model selection by choosing the model with the smallest predicted generalization error. We show how this method can be used for linear regression, neural networks, and deep learning. Experimental results show that DDL leads to smaller generalization error than cross-validation and traditional MDL and Bayes methods.
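
As a rough illustration of the idea in the abstract, the sketch below selects a polynomial degree for linear regression by comparing a difference of description lengths. It is not the paper's implementation. Everything specific here is an assumption made for the example: the two description lengths are taken to be those of the full training set and of its first `m` samples, the code is a prequential Gaussian plug-in code (each sample is encoded with a model fitted to the samples before it), and the names `design`, `codelength`, `ddl`, the warm-up length `start`, the variance floor, and the synthetic cubic data are all hypothetical. Codelengths are negative log densities in nats, ignoring the quantization constant shared by all models.

```python
import numpy as np

def design(x, degree):
    # Polynomial design matrix with columns 1, x, ..., x**degree.
    return np.vander(x, degree + 1, increasing=True)

def codelength(x, y, degree, start):
    """Prequential codelength (nats) of y[start:].

    Each y[i] is encoded with a Gaussian whose mean and variance come from
    a least-squares fit to the first i samples only (assumed coding scheme).
    """
    total = 0.0
    for i in range(start, len(y)):
        A = design(x[:i], degree)
        coef, *_ = np.linalg.lstsq(A, y[:i], rcond=None)
        resid = y[:i] - A @ coef
        dof = max(i - (degree + 1), 1)
        var = max(resid @ resid / dof, 1e-6)  # floor is an arbitrary safeguard
        err = y[i] - (design(x[i:i + 1], degree) @ coef)[0]
        total += 0.5 * np.log(2 * np.pi * var) + err**2 / (2 * var)
    return total

def ddl(x, y, degree, m, start):
    # Difference of two description lengths of the training data:
    # the full set minus its first m samples (assumed form of the difference).
    return codelength(x, y, degree, start) - codelength(x[:m], y[:m], degree, start)

# Synthetic training data from a cubic with Gaussian noise (illustrative only).
rng = np.random.default_rng(0)
n, m, start = 60, 30, 8
x = rng.uniform(-2.0, 2.0, n)
y = 1.0 + 2.0 * x - x**3 + 0.3 * rng.standard_normal(n)

# Per-sample DDL serves as the predicted generalization loss for each degree.
degrees = range(6)
scores = {d: ddl(x, y, d, m, start) / (n - m) for d in degrees}
best = min(scores, key=scores.get)
print(scores)
print("selected degree:", best)
```

With a prequential code the difference reduces to the log loss on samples `m` through `n - 1`, each predicted from the samples before it, which is why dividing by `n - m` gives a per-sample estimate. No held-out set is used: every sample is part of the training data being encoded.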

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.04699/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/1902.04699/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1902.04699/full.md

---
Source: https://tomesphere.com/paper/1902.04699