# Learning by Active Nonlinear Diffusion

**Authors:** Mauro Maggioni, James M. Murphy

arXiv: 1905.12989 · 2019-05-31

## TL;DR

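The paper introduces an active learning method that uses diffusion distances on graphs to label high-dimensional data with nonlinear cluster geometry from a small number of queried labels. It comes with performance guarantees under a geometric data model, runs in time quasilinear in the number of unlabeled points, and is competitive on synthetic datasets and real hyperspectral images.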

## Contribution

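The paper presents an active learning method based on diffusion distances that tolerates nonlinear cluster geometries, high ambient dimensionality, noise, and outliers. The method is backed by theoretical performance guarantees and by an implementation that is quasilinear in the number of unlabeled data points.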

## Key findings

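- The method produces high-accuracy labelings from a small number of carefully chosen label queries.
- Its performance guarantees hold under a general geometric data model that permits nonlinear cluster geometries, high ambient dimensionality, and significant noise and outlier corruption.
- It shows competitive empirical performance on synthetic datasets and real hyperspectral remote sensing images.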

## Abstract

This article proposes an active learning method for high dimensional data, based on intrinsic data geometries learned through diffusion processes on graphs. Diffusion distances are used to parametrize low-dimensional structures on the dataset, which allow for high-accuracy labelings of the dataset with only a small number of carefully chosen labels. The geometric structure of the data suggests regions that have homogeneous labels, as well as regions with high label complexity that should be queried for labels. The proposed method enjoys theoretical performance guarantees on a general geometric data model, in which clusters corresponding to semantically meaningful classes are permitted to have nonlinear geometries, high ambient dimensionality, and suffer from significant noise and outlier corruption. The proposed algorithm is implemented in a manner that is quasilinear in the number of unlabeled data points, and exhibits competitive empirical performance on synthetic datasets and real hyperspectral remote sensing images.
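To make the notion of a diffusion distance concrete, here is a minimal sketch using the standard definition: build a random walk on a kernel graph over the data, run it for `t` steps, and compare the resulting transition distributions, weighted by the inverse stationary distribution. The Gaussian kernel, the parameters `sigma` and `t`, the dense matrix power, and the helper names `diffusion_distances` and `two_blob_demo` are all assumptions made for illustration. This is not the paper's quasilinear implementation, and it does not include the label query step.

```python
import numpy as np


def diffusion_distances(X, sigma=1.0, t=10):
    """Pairwise diffusion distances at time t for the rows of X.

    Dense illustrative version (cubic cost); kernel and parameters are assumed.
    """
    # Gaussian kernel weights from squared Euclidean distances (assumed kernel).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / sigma**2)

    # Row-normalize W into a Markov transition matrix P on the data graph.
    degrees = W.sum(axis=1)
    P = W / degrees[:, None]

    # Stationary distribution of the random walk (W is symmetric).
    pi = degrees / degrees.sum()

    # t-step transition probabilities.
    P_t = np.linalg.matrix_power(P, t)

    # D_t(x, y)^2 = sum_u (P_t[x, u] - P_t[y, u])^2 / pi[u]
    scaled = P_t / np.sqrt(pi)[None, :]
    diffs = scaled[:, None, :] - scaled[None, :, :]
    return np.sqrt(np.sum(diffs**2, axis=-1))


def two_blob_demo(seed=0, n_per_blob=10):
    """Diffusion distances for two well-separated synthetic blobs (toy data)."""
    rng = np.random.default_rng(seed)
    blob_a = rng.normal(0.0, 0.3, size=(n_per_blob, 2))
    blob_b = rng.normal(0.0, 0.3, size=(n_per_blob, 2)) + 10.0
    X = np.vstack([blob_a, blob_b])
    return diffusion_distances(X, sigma=1.0, t=10)
```

On the toy data from `two_blob_demo`, distances between points in the same blob come out much smaller than distances across blobs, which is the kind of geometric separation the abstract describes as suggesting regions with homogeneous labels.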

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.12989/full.md

## Figures

41 figures with captions in the complete paper: https://tomesphere.com/paper/1905.12989/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/1905.12989/full.md

---
Source: https://tomesphere.com/paper/1905.12989