Generative Feature Training of Thin 2-Layer Networks

Johannes Hertrich; Sebastian Neumayer

arXiv:2411.06848 · cs.LG · August 14, 2025

TL;DR

This paper introduces a method for training thin 2-layer neural networks on small datasets. The hidden weights are initialized with samples from a learned deep generative model and then refined by gradient-based post-processing in the latent space, which improves approximation quality.

Contribution

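The paper proposes generative feature training: hidden weights are sampled from a learned proposal distribution rather than optimized directly from a random initialization. This mitigates the local minima that gradient-based training encounters in the highly non-convex loss landscape of small networks.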

Findings

01. Effective initialization via learned generative models improves training outcomes.

02. Gradient-based post-processing enhances approximation accuracy.

03. Numerical examples demonstrate the method's practical benefits.

Abstract

We consider the approximation of functions by 2-layer neural networks with a small number of hidden weights based on the squared loss and small datasets. Due to the highly non-convex energy landscape, gradient-based training often suffers from local minima. As a remedy, we initialize the hidden weights with samples from a learned proposal distribution, which we parameterize as a deep generative model. To train this model, we exploit the fact that with fixed hidden weights, the optimal output weights solve a linear equation. After learning the generative model, we refine the sampled weights with a gradient-based post-processing in the latent space. Here, we also include a regularization scheme to counteract potential noise. Finally, we demonstrate the effectiveness of our approach by numerical examples.
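
The observation that the optimal output weights solve a linear equation once the hidden weights are fixed can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the cosine activation, the small ridge term `reg`, the toy target function, and the helper names `features`, `optimal_output_weights`, and `squared_loss`. The hidden weights `W` and `b` are drawn from a plain Gaussian and a uniform distribution as a stand-in for samples from the learned generative model.

```python
import numpy as np

def features(X, W, b):
    """Hidden-layer activations cos(X W^T + b) for fixed hidden weights (assumed activation)."""
    return np.cos(X @ W.T + b)

def optimal_output_weights(X, y, W, b, reg=1e-6):
    """Solve the regularized normal equations (Phi^T Phi + reg * I) a = Phi^T y."""
    Phi = features(X, W, b)
    gram = Phi.T @ Phi + reg * np.eye(Phi.shape[1])
    return np.linalg.solve(gram, Phi.T @ y)

def squared_loss(X, y, W, b, a):
    """Mean squared error of the 2-layer network with hidden weights (W, b) and output weights a."""
    return float(np.mean((features(X, W, b) @ a - y) ** 2))

rng = np.random.default_rng(0)

# Small toy dataset (hypothetical target function).
X = rng.uniform(-1.0, 1.0, size=(50, 2))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1]

# Thin network with 8 hidden neurons; these samples stand in for generator outputs.
W = rng.normal(size=(8, 2))
b = rng.uniform(0.0, 2.0 * np.pi, size=8)

a = optimal_output_weights(X, y, W, b)
loss_opt = squared_loss(X, y, W, b, a)
print("loss with optimal output weights:", loss_opt)
```

Because the output weights are available in closed form, the training loss can be viewed as a function of the hidden weights alone, which is what makes it possible to score candidate hidden weights when learning a proposal distribution over them.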

Code & Models

Repositories

johertrich/generative_feature_training
PyTorch · Official

Taxonomy

Topics: Face and Expression Recognition · Machine Learning and ELM · Advanced Computing and Algorithms