Flows and Diffusions on the Neural Manifold
Daniel Saragih, Deyu Cao, Tejas Balaji

TL;DR
This paper introduces a theoretical framework for modeling optimization trajectories as flows on the neural manifold, enhancing weight space learning and enabling applications like improved initialization and covariate shift detection.
Contribution
The paper unifies several trajectory inference techniques under the goal of matching a gradient flow, and incorporates structural priors from optimization dynamics into weight space modeling through architectural and algorithmic choices.
Findings
Outperforms baselines in generating in-distribution weights
Improves initialization for downstream training
Effective in detecting harmful covariate shifts
Abstract
Diffusion and flow-based generative models have achieved remarkable success in domains such as image synthesis, video generation, and natural language modeling. In this work, we extend these advances to weight space learning by leveraging recent techniques to incorporate structural priors derived from optimization dynamics. Central to our approach is modeling the trajectory induced by gradient descent as a trajectory inference problem. We unify several trajectory inference techniques towards matching a gradient flow, providing a theoretical framework for treating optimization paths as inductive bias. We further explore architectural and algorithmic choices, including reward fine-tuning by adjoint matching, the use of autoencoders for latent weight representation, conditioning on task-specific context data, and adopting informative source distributions such as Kaiming uniform.…
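To make the ingredients named in the abstract concrete, the following is a minimal sketch of a generic conditional flow matching objective with a Kaiming uniform source distribution. It is not the authors' implementation: the linear interpolant, the ReLU-gain bound sqrt(6 / fan_in), the function names, and the Gaussian stand-in for trained weights are all assumptions made for illustration.

```python
import numpy as np

def kaiming_uniform(rng, fan_in, shape):
    # Assumption: ReLU gain sqrt(2), giving bound = sqrt(6 / fan_in).
    bound = np.sqrt(6.0 / fan_in)
    return rng.uniform(-bound, bound, size=shape)

def flow_matching_pair(x0, x1, t):
    # Linear interpolant between source x0 and target x1 at times t in [0, 1].
    # Returns the interpolated point and the target velocity along the path.
    t = t[:, None]
    x_t = (1.0 - t) * x0 + t * x1
    u_t = x1 - x0
    return x_t, u_t

def flow_matching_loss(velocity_fn, x0, x1, t):
    # Mean squared error between a candidate velocity field and the target.
    x_t, u_t = flow_matching_pair(x0, x1, t)
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - u_t) ** 2))

rng = np.random.default_rng(0)
fan_in, fan_out, batch = 8, 4, 16
dim = fan_in * fan_out

# Source samples: flattened weights at a Kaiming uniform initialization.
x0 = kaiming_uniform(rng, fan_in, (batch, dim))
# Hypothetical stand-in for trained weights; real targets would come from
# optimization runs, which this sketch does not perform.
x1 = rng.normal(size=(batch, dim))
t = rng.uniform(size=batch)

# A zero velocity field as a trivial baseline for the loss.
baseline = flow_matching_loss(lambda x, s: np.zeros_like(x), x0, x1, t)
print("baseline loss:", baseline)
```

In practice `velocity_fn` would be a neural network trained to minimize this loss, and the straight-line path could be replaced by intermediate checkpoints of the optimization trajectory; both of those choices are beyond what this sketch covers.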
Taxonomy
Topics: Neural Networks and Applications
