Single-Solution Hypervolume Maximization and its use for Improving Generalization of Neural Networks

Conrado S. Miranda; Fernando J. Von Zuben

arXiv:1602.01164·cs.LG·February 4, 2016·1 cites

Open Access

TL;DR

This paper proposes a hypervolume maximization approach with a single solution as an alternative to mean loss minimization, offering theoretical insights and empirical validation that can improve neural network generalization.

Contribution

It introduces a hypervolume maximization method with a controllable hyperparameter, bridging the gap between max and mean loss minimization, and demonstrates its effectiveness on MNIST.

Findings

01 Hypervolume maximization can mimic mean loss minimization.

02 Adjusting the hyperparameter changes how strongly the gradient focuses on higher-loss samples.

03 Hypervolume maximization achieved a 20% reduction in classification error on the MNIST test set.
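
The first two findings can be illustrated with a short derivation. It is a sketch that assumes the single-solution log-hypervolume takes the form below, with a reference point \mu larger than every per-sample loss. The symbols \mu, \ell_i, w_i and N are notation chosen here for illustration, not quoted from the paper.

```latex
\begin{align}
\log H_\mu(\theta) &= \sum_{i=1}^{N} \log\bigl(\mu - \ell_i(\theta)\bigr),
  \qquad \mu > \max_i \ell_i(\theta), \\
\nabla_\theta \log H_\mu(\theta) &= -\sum_{i=1}^{N}
  \frac{1}{\mu - \ell_i(\theta)} \, \nabla_\theta \ell_i(\theta), \\
w_i &= \frac{\bigl(\mu - \ell_i(\theta)\bigr)^{-1}}
  {\sum_{j=1}^{N} \bigl(\mu - \ell_j(\theta)\bigr)^{-1}},
  \qquad \sum_{i=1}^{N} w_i = 1 .
\end{align}
```

Under this assumption, the normalized weights w_i tend to 1/N as \mu grows, so the update direction approaches that of mean loss minimization. As \mu approaches the largest loss from above, nearly all of the weight moves to the highest-loss sample, which corresponds to minimizing the max loss.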

Abstract

This paper introduces hypervolume maximization with a single solution as an alternative to mean loss minimization. The relationship between the two problems is proved through bounds on the cost function when an optimal solution to one of the problems is evaluated on the other, with a hyperparameter to control the similarity between the two problems. This same hyperparameter allows higher weight to be placed on samples with higher loss when computing the hypervolume's gradient, whose normalized version can range from the mean loss to the max loss. An experiment on MNIST with a neural network is used to validate the theory developed, showing that hypervolume maximization can behave similarly to mean loss minimization and can also provide better performance, resulting in a 20% reduction of the classification error on the test set.
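
The range "from the mean loss to the max loss" can be checked numerically with a minimal sketch. It assumes the log-hypervolume is the sum of log(mu - loss_i) and that the reference point is set as mu = max(losses) + slack. The function names and the `slack` parameterization are hypothetical, chosen for illustration, and may differ from the hyperparameter used in the paper.

```python
def hypervolume_weights(losses, slack):
    """Normalized per-sample gradient weights of a log-hypervolume objective.

    Assumption (not taken from the paper): the objective is
    sum_i log(mu - loss_i), with reference point mu = max(losses) + slack.
    """
    if slack <= 0:
        raise ValueError("slack must be positive so that mu exceeds every loss")
    mu = max(losses) + slack
    # The derivative of -log(mu - loss_i) w.r.t. loss_i is 1 / (mu - loss_i),
    # so samples with higher loss receive larger raw weights.
    raw = [1.0 / (mu - loss) for loss in losses]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_loss(losses, slack):
    """Loss average under the normalized hypervolume weights."""
    weights = hypervolume_weights(losses, slack)
    return sum(w * loss for w, loss in zip(weights, losses))


if __name__ == "__main__":
    sample_losses = [0.1, 0.2, 0.4, 2.0]  # made-up per-sample losses
    for slack in (1e-6, 1.0, 1e6):
        print(f"slack={slack:g}  weighted loss={weighted_loss(sample_losses, slack):.6f}")
```

With a very small `slack` the weighted loss sits close to the largest loss, and with a very large `slack` it sits close to the plain mean. Intermediate values interpolate between the two.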

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference