# Direct estimation of density functionals using a polynomial basis

**Authors:** Alan Wisler, Visar Berisha, Andreas Spanias, Alfred O. Hero

arXiv: 1702.06516 · 2018-02-14

## TL;DR

This paper introduces a novel data-driven basis approach for directly estimating density functionals from samples, avoiding distribution fitting and high-dimensional integration, with applications to divergence measures and error bounds.

## Contribution

It proposes a data-driven complete basis, similar to the deterministic Bernstein polynomial basis, for approximating functionals of two densities directly from samples, without parametric assumptions, explicit density estimation, or high-dimensional integration.

## Key findings

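- Data-driven estimators are developed for the Kullback-Leibler divergences and the Hellinger distance, using only samples from the two distributions.
- The new basis set can approximate functions of distributions as closely as desired.
- Empirical estimates of tight bounds on the Bayes error rate are constructed with the same methodology.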
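To make the target quantities concrete, the sketch below evaluates two divergence functionals by brute-force numerical integration for a pair of known one-dimensional Gaussians. This is the kind of density-plus-integration computation the paper aims to avoid, not the paper's method. The helper names (`gaussian_pdf`, `integrate`, `kl_divergence`, `squared_hellinger`), the integration range, and the convention H² = 1 − ∫√(pq) for the squared Hellinger distance are assumptions made for illustration.

```python
from math import exp, log, pi, sqrt


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return exp(-0.5 * z * z) / (std * sqrt(2.0 * pi))


def integrate(g, lo=-12.0, hi=12.0, steps=24000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * dx) for i in range(steps)) * dx


def kl_divergence(p, q):
    """KL(p || q) = integral of p(x) * log(p(x) / q(x)) dx."""
    return integrate(lambda x: p(x) * log(p(x) / q(x)))


def squared_hellinger(p, q):
    """Squared Hellinger distance, assumed convention: 1 - integral sqrt(p q)."""
    return 1.0 - integrate(lambda x: sqrt(p(x) * q(x)))


def p(x):
    return gaussian_pdf(x, 0.0, 1.0)  # N(0, 1)


def q(x):
    return gaussian_pdf(x, 1.0, 1.0)  # N(1, 1)


if __name__ == "__main__":
    # Closed forms for equal-variance Gaussians, for comparison:
    # KL = (mean gap)^2 / (2 * variance), H^2 = 1 - exp(-(mean gap)^2 / (8 * variance)).
    print("KL(p || q):", kl_divergence(p, q), "closed form:", 0.5)
    print("H^2(p, q):", squared_hellinger(p, q), "closed form:", 1.0 - exp(-0.125))
```

Both functionals need the densities themselves and an integral over the whole sample space. That is tractable here only because the example is one-dimensional with known densities.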

## Abstract

A number of fundamental quantities in statistical signal processing and information theory can be expressed as integral functions of two probability density functions. Such quantities are called density functionals as they map density functions onto the real line. For example, information divergence functions measure the dissimilarity between two probability density functions and are useful in a number of applications. Typically, estimating these quantities requires complete knowledge of the underlying distribution followed by multi-dimensional integration. Existing methods make parametric assumptions about the data distribution or use non-parametric density estimation followed by high-dimensional integration. In this paper, we propose a new alternative. We introduce the concept of "data-driven basis functions" - functions of distributions whose value we can estimate given only samples from the underlying distributions without requiring distribution fitting or direct integration. We derive a new data-driven complete basis that is similar to the deterministic Bernstein polynomial basis and develop two methods for performing basis expansions of functionals of two distributions. We also show that the new basis set allows us to approximate functions of distributions as closely as desired. Finally, we evaluate the methodology by developing data-driven estimators for the Kullback-Leibler divergences and the Hellinger distance and by constructing empirical estimates of tight bounds on the Bayes error rate.
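For readers unfamiliar with the deterministic Bernstein polynomial basis that the abstract compares against, here is a minimal sketch of the classical Bernstein approximation of a function on [0, 1]. It illustrates only the well-known deterministic construction, not the paper's data-driven basis. The function names and the choice of `sqrt` as the target function are assumptions made for illustration.

```python
from math import comb, sqrt


def bernstein_approx(f, n, x):
    """Degree-n Bernstein polynomial of f evaluated at x in [0, 1].

    B_n(f)(x) = sum_{k=0}^{n} f(k/n) * C(n, k) * x^k * (1 - x)^(n - k)
    """
    return sum(
        f(k / n) * comb(n, k) * x**k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )


def max_error(f, n, grid_size=201):
    """Largest absolute approximation error on a uniform grid over [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(f(x) - bernstein_approx(f, n, x)) for x in grid)


if __name__ == "__main__":
    # The error shrinks as the degree grows, mirroring the
    # "as closely as desired" approximation property of a complete basis.
    for n in (5, 20, 80):
        print(f"degree {n:3d}: max error {max_error(sqrt, n):.4f}")
```

Each Bernstein coefficient is a value of the target function at a grid point `k/n`, so increasing the degree refines the approximation without changing the form of the expansion.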

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.06516/full.md

## Figures

19 figures with captions in the complete paper: https://tomesphere.com/paper/1702.06516/full.md

## References

59 references — full list in the complete paper: https://tomesphere.com/paper/1702.06516/full.md

---
Source: https://tomesphere.com/paper/1702.06516