# A Look at the Effect of Sample Design on Generalization through the Lens of Spectral Analysis

**Authors:** Bhavya Kailkhura, Jayaraman J. Thiagarajan, Qunwei Li, Peer-Timo Bremer

arXiv: 1906.02732 · 2019-06-11

## TL;DR

This paper introduces a spectral analysis framework to understand how sampling patterns influence the generalization error of machine learning models, linking geometric properties to spectral forms and providing error bounds.

## Contribution

It develops a novel spectral analysis approach in Euclidean space that connects sampling geometry with generalization performance, offering insights for designing optimal sampling strategies.

## Key findings

- Spectral properties of sampling patterns affect generalization error.
- Error bounds and convergence rates are derived for various sampling methods.
- Insights are provided that are independent of specific learning architectures.

## Abstract

This paper provides a general framework to study the effect of sampling properties of training data on the generalization error of the learned machine learning (ML) models. Specifically, we propose a new spectral analysis of the generalization error, expressed in terms of the power spectra of the sampling pattern and the function involved. The framework is built in the Euclidean space using Fourier analysis and establishes a connection between some high dimensional geometric objects and optimal spectral form of different state-of-the-art sampling patterns. Subsequently, we estimate the expected error bounds and convergence rate of different state-of-the-art sampling patterns, as the number of samples and dimensions increase. We make several observations about generalization error which are valid irrespective of the approximation scheme (or learning architecture) and training (or optimization) algorithms. Our result also sheds light on ways to formulate design principles for constructing optimal sampling methods for particular problems.
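To make "power spectrum of a sampling pattern" concrete, the following is a minimal sketch, not the paper's method. It estimates the periodogram of a 2D point set at low integer frequencies and compares uniform random sampling with jittered (stratified) sampling. The function names, the grid size, the frequency range, and the choice of jittered sampling as the comparison pattern are all assumptions made for illustration; this summary does not say which patterns the paper analyzes.

```python
import numpy as np

def power_spectrum(points, freqs):
    """Periodogram P(k) = |sum_j exp(-2*pi*i * k . x_j)|^2 / N for each row k of freqs."""
    phase = -2.0 * np.pi * points @ freqs.T        # shape (N, M)
    amplitude = np.exp(1j * phase).sum(axis=0)     # shape (M,)
    return np.abs(amplitude) ** 2 / len(points)

def random_samples(n, dim, rng):
    """n i.i.d. uniform points in the unit cube [0, 1)^dim."""
    return rng.random((n, dim))

def jittered_samples(m, rng):
    """One uniform point per cell of an m x m grid over [0, 1)^2."""
    ii, jj = np.meshgrid(np.arange(m), np.arange(m), indexing="ij")
    cells = np.stack([ii, jj], axis=-1).reshape(-1, 2)
    return (cells + rng.random(cells.shape)) / m

rng = np.random.default_rng(0)
m = 16            # hypothetical grid size, giving N = m * m = 256 samples
trials = 50       # number of independent patterns to average over

# Low-magnitude integer frequency vectors, excluding the zero (DC) frequency.
ks = np.array(
    [(kx, ky) for kx in range(-4, 5) for ky in range(-4, 5) if (kx, ky) != (0, 0)],
    dtype=float,
)

ps_random = np.mean(
    [power_spectrum(random_samples(m * m, 2, rng), ks) for _ in range(trials)], axis=0
)
ps_jitter = np.mean(
    [power_spectrum(jittered_samples(m, rng), ks) for _ in range(trials)], axis=0
)

print("mean low-frequency power, random  :", ps_random.mean())
print("mean low-frequency power, jittered:", ps_jitter.mean())
```

For uniform random points the expected power at every nonzero integer frequency is 1 (a flat spectrum), whereas stratification suppresses power at low frequencies. Differences of this kind in the spectrum of the sampling pattern are what a spectral view of generalization error is concerned with.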

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.02732/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1906.02732/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1906.02732/full.md

---
Source: https://tomesphere.com/paper/1906.02732