# Topology-based Representative Datasets to Reduce Neural Network Training Resources

**Authors:** Rocio Gonzalez-Diaz, Miguel A. Gutiérrez-Naranjo, Eduardo Paluzo-Hidalgo

arXiv: 1903.08519 · 2024-03-14

## TL;DR

This paper introduces a topology-based method for creating smaller, representative datasets that maintain neural network training accuracy, significantly reducing training time.

## Contribution

It proposes a novel approach using persistence diagrams to identify representative datasets, backed by theoretical proofs and experimental validation.

## Key findings

- Representative datasets achieve similar accuracy to original datasets for perceptrons.
- Using topology reduces dataset size without sacrificing training quality.
- The method accelerates neural network training by reducing the size of the dataset.

## Abstract

One of the main drawbacks of the practical use of neural networks is the long time required in the training process. Such a training process consists of an iterative change of parameters trying to minimize a loss function. These changes are driven by a dataset, which can be seen as a set of labelled points in an n-dimensional space. In this paper, we explore the concept of a representative dataset, which is a dataset smaller than the original one, satisfying a nearness condition independent of isometric transformations. Representativeness is measured using persistence diagrams (a computational topology tool) due to their computational efficiency. We prove that the accuracy of the learning process of a neural network on a representative dataset is "similar" to the accuracy on the original dataset when the neural network architecture is a perceptron and the loss function is the mean squared error. These theoretical results, accompanied by experimentation, open a door to reducing the size of the dataset to gain time in the training process of any neural network.
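The nearness condition in the abstract can be illustrated with a small sketch. It assumes one plausible reading of the condition: every labelled point of the original dataset has a point with the same label in the smaller dataset within Euclidean distance `eps`. The function names, the greedy selection rule, the synthetic data, and the value of `eps` are all illustrative assumptions and are not taken from the paper. The isometric transformation is fixed to the identity, and the persistence-diagram comparison is not covered here.

```python
import numpy as np


def is_eps_representative(X, y, X_rep, y_rep, eps):
    """Return True if every (x, c) in (X, y) has a same-label point
    of (X_rep, y_rep) within Euclidean distance eps.
    Assumption: the isometry is the identity map."""
    for x, c in zip(X, y):
        same_label = X_rep[y_rep == c]
        if same_label.size == 0:
            return False  # label c is missing from the subset
        if np.min(np.linalg.norm(same_label - x, axis=1)) > eps:
            return False
    return True


def greedy_representative(X, y, eps):
    """Hypothetical greedy selection: keep a point only when no
    already-kept point with the same label lies within eps of it."""
    kept = []
    for i, (x, c) in enumerate(zip(X, y)):
        covered = any(
            y[j] == c and np.linalg.norm(X[j] - x) <= eps for j in kept
        )
        if not covered:
            kept.append(i)
    return np.array(kept)


# Synthetic two-class data: two Gaussian blobs in the plane.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(100, 2)),
    rng.normal(3.0, 0.3, size=(100, 2)),
])
y = np.array([0] * 100 + [1] * 100)

eps = 0.5
idx = greedy_representative(X, y, eps)
X_rep, y_rep = X[idx], y[idx]

print(f"original size: {len(X)}, reduced size: {len(X_rep)}")
print("eps-representative:", is_eps_representative(X, y, X_rep, y_rep, eps))
```

By construction, every point the greedy pass discards lies within `eps` of a kept point with the same label, so the check succeeds on the reduced set. A smaller `eps` keeps more points and a larger one keeps fewer, which is the trade-off between dataset size and nearness that the abstract describes.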

## Figures

40 figures with captions in the complete paper: https://tomesphere.com/paper/1903.08519/full.md

---
Source: https://tomesphere.com/paper/1903.08519