Probability Distribution Learning and Its Application in Deep Learning

Binchuan Qi; Wei Gong; Li Li

arXiv:2406.05666·cs.LG·October 9, 2025

Open Access

TL;DR

This paper introduces a probability distribution learning framework to analyze deep learning's optimization and generalization, providing theoretical guarantees and insights into neural network training.

Contribution

It proposes the PD learning framework, proves that the Fenchel-Young loss is the natural and necessary choice for PD learning problems, and introduces new concepts used to explain the effectiveness of SGD and to derive generalization bounds.

Findings

01

Fenchel-Young loss is necessary for PD learning.

02

Introduces $\text{H}(\psi)$-convexity and $\text{H}(\Psi)$-smoothness for DNNs.

03

Provides bounds on risk and generalization error that depend on training set size and information loss.

Abstract

Despite its empirical success, deep learning still lacks a comprehensive theoretical understanding of model fitting and generalization. This paper proposes the probability distribution (PD) learning framework to analyze the optimization and generalization mechanisms of deep learning. Within this framework, the conditional distribution of labels given features is the primary learning target, with the loss function, prior knowledge, and model properties explicitly characterized. Under these formulations, we establish theoretical guarantees on optimizability, even in non-convex settings, and derive generalization error bounds that provide meaningful explanations for practical performance. Specifically, we first prove theoretically that the Fenchel-Young loss is the natural and necessary choice for solving PD learning problems, thereby justifying the generality of conclusions based on this…
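For readers unfamiliar with the Fenchel-Young loss named in the abstract, the following is a minimal sketch, assuming the standard convex-analysis definition $L_\Omega(\theta; y) = \Omega^*(\theta) + \Omega(y) - \langle \theta, y \rangle$ rather than anything specific to this paper, whose own formulation is not shown on this page. The regularizer $\Omega$ is taken here to be the negative Shannon entropy on the probability simplex, an illustrative choice under which $\Omega^*$ is log-sum-exp and the loss coincides with the KL divergence between the target distribution and the softmax of the scores. All function names are hypothetical.

```python
import math


def logsumexp(theta):
    """Omega*(theta): conjugate of negative Shannon entropy on the simplex."""
    m = max(theta)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in theta))


def neg_entropy(p):
    """Omega(p) = sum_i p_i log p_i, with 0 log 0 treated as 0."""
    return sum(pi * math.log(pi) for pi in p if pi > 0)


def fenchel_young_loss(theta, y):
    """L_Omega(theta; y) = Omega*(theta) + Omega(y) - <theta, y>."""
    inner = sum(t * yi for t, yi in zip(theta, y))
    return logsumexp(theta) + neg_entropy(y) - inner


def softmax(theta):
    """Prediction map associated with this choice of Omega."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions, skipping zero-mass terms of p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Illustrative scores and target distribution (made-up values).
theta = [2.0, -1.0, 0.5]
y = [0.7, 0.2, 0.1]

loss = fenchel_young_loss(theta, y)
kl = kl_divergence(y, softmax(theta))
print(loss, kl)  # the two values agree up to floating-point error
```

With a one-hot target the same function reduces to the usual softmax cross-entropy, and the loss is zero when the target equals the softmax of the scores. Other choices of $\Omega$ give other members of the Fenchel-Young family.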

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: Dropout