# Latent-Space Mean-Field Theory for Deep BitNet-like Training: Constrained Gradient Flows with Smooth Quantization and STE Limits

**Authors:** Dongwon Kim, Dongseok Lee

arXiv: 2509.00133 · 2025-09-03

## TL;DR

This paper develops a mean-field framework for the training dynamics of deep BitNet-like quantized networks. It shows that, as the quantization smoothing vanishes, the latent-weight dynamics converge to a well-defined limit with bounds that are uniform in the smoothing parameter.

## Contribution

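The paper develops a mean-field analysis of deep BitNet-like networks in the limit of vanishing quantization smoothing. Under standard regularity assumptions, it proves that empirical measures of the latent weights converge weakly to solutions of constrained continuity equations, giving a distributional foundation for gradient-based training of quantized networks.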
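For orientation, the objects named in this section can be written in the generic form common to mean-field analyses. The symbols $N$, $w_i$, $\mu_t^N$, $\mu_t$, $v$, and $\varphi$ are notation assumed here for illustration only. The paper's specific constraint and velocity field are given in the full text and are not reproduced in this summary.

```latex
% Empirical measure of N latent weights w_1(t), ..., w_N(t) (assumed notation)
\mu_t^N = \frac{1}{N} \sum_{i=1}^{N} \delta_{w_i(t)}

% Weak convergence: tested against every bounded continuous function \varphi
\int \varphi \, d\mu_t^N \;\longrightarrow\; \int \varphi \, d\mu_t

% Generic continuity equation for the limit \mu_t, with a velocity field v
% that may depend on the measure itself (the paper adds a constraint)
\partial_t \mu_t + \nabla_w \cdot \bigl( \mu_t \, v(w, \mu_t) \bigr) = 0
```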

## Key findings

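- Empirical measures of latent weights converge weakly to solutions of constrained continuity equations as the quantization smoothing vanishes.
- The natural exponential decay in smooth quantization cancels apparent singularities, yielding bounds on the mean-field dynamics that are uniform in the smoothing parameter.
- Under standard regularity assumptions, the dynamics converge to a well-defined limit, which gives a rigorous mathematical basis for gradient-based training of quantized neural networks.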
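The cancellation described in the second finding can be illustrated numerically with a toy smooth quantizer. The sketch below assumes `tanh(w / eps)` as a smooth surrogate for `sign(w)`. That choice, and the names `smooth_sign`, `smooth_sign_grad`, and `total_mass`, are hypothetical and may differ from the quantizer analyzed in the paper. The gradient carries a `1 / eps` prefactor that blows up at `w = 0`, but away from zero the exponential decay of `sech^2` dominates that prefactor, and the total gradient mass stays bounded by 2 for every `eps`.

```python
import numpy as np


def smooth_sign(w, eps):
    """Toy smooth surrogate for sign(w); an assumption, not the paper's quantizer."""
    return np.tanh(w / eps)


def smooth_sign_grad(w, eps):
    """Derivative of tanh(w / eps) in w: (1 / eps) * sech^2(w / eps).

    Written as (1 - tanh^2) / eps to avoid overflow in cosh for large |w| / eps.
    """
    t = np.tanh(w / eps)
    return (1.0 - t ** 2) / eps


def total_mass(eps, half_width=1.0, n=200001):
    """Riemann-sum estimate of the integral of the gradient over [-half_width, half_width]."""
    w = np.linspace(-half_width, half_width, n)
    dw = w[1] - w[0]
    return float(np.sum(smooth_sign_grad(w, eps)) * dw)


for eps in (0.5, 0.1, 0.02, 0.01):
    peak = float(smooth_sign_grad(0.0, eps))      # grows like 1 / eps
    off_peak = float(smooth_sign_grad(0.5, eps))  # decays exponentially in 1 / eps
    print(f"eps={eps}: peak={peak:.3g}, grad(0.5)={off_peak:.3g}, mass={total_mass(eps):.6f}")
```

In this toy setting the pointwise gradient at any fixed nonzero weight tends to zero as `eps` shrinks, while the integrated gradient remains bounded. This is the kind of uniform-in-`eps` control the finding refers to, shown here only for a one-dimensional example.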

## Abstract

This work develops a mean-field analysis for the asymptotic behavior of deep BitNet-like architectures as smooth quantization parameters approach zero. We establish that empirical measures of latent weights converge weakly to solutions of constrained continuity equations under vanishing quantization smoothing. Our main theoretical contribution demonstrates that the natural exponential decay in smooth quantization cancels out apparent singularities, yielding uniform bounds on mean-field dynamics independent of smoothing parameters. Under standard regularity assumptions, we prove convergence to a well-defined limit that provides the mathematical foundation for gradient-based training of quantized neural networks through distributional analysis.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00133/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/2509.00133/full.md

---
Source: https://tomesphere.com/paper/2509.00133