NeuroGen: Neural Network Parameter Generation via Large Language Models
Jiaqi Wang, Yusen Zhang, Xi Li

TL;DR
NeuroGen introduces a novel method where large language models generate neural network parameters conditioned on task descriptions, offering a new paradigm beyond traditional iterative optimization methods.
Contribution
The paper presents a two-stage approach for generating neural network parameters with large language models: Parameter Reference Knowledge Injection, in which the LLM is pretrained on NN checkpoints, followed by Context-Enhanced Instruction Tuning with task-aware prompts.
Findings
NeuroGen can generate usable neural network parameters from descriptions of the data, task, and network architecture.
The approach demonstrates the feasibility of LLM-based parameter generation.
Results suggest a new paradigm for neural network training and design.
Abstract
Acquiring the parameters of neural networks (NNs) has been one of the most important problems in machine learning since the inception of NNs. Traditional approaches, such as backpropagation and forward-only optimization, acquire parameters via iterative data fitting to gradually optimize them. This paper aims to explore the feasibility of a new direction: acquiring NN parameters via large language model generation. We propose NeuroGen, a generalized and easy-to-implement two-stage approach for NN parameter generation conditioned on descriptions of the data, task, and network architecture. Stage one is Parameter Reference Knowledge Injection, where LLMs are pretrained on NN checkpoints to build foundational understanding of parameter space, whereas stage two is Context-Enhanced Instruction Tuning, enabling LLMs to adapt to specific tasks through enriched, task-aware prompts. Experimental…
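To make the idea of generating parameters conditioned on descriptions more concrete, here is a minimal sketch of the surrounding plumbing: building a prompt from data, task, and architecture descriptions, then decoding generated text back into layer weights. Everything in it is an assumption for illustration, not the paper's method: the names `LayerSpec`, `build_prompt`, `decode_parameters`, and `forward` are hypothetical, the whitespace-separated number format is invented, and a hard-coded string stands in for the LLM output since no model is called.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LayerSpec:
    """Hypothetical description of one fully connected layer."""
    in_features: int
    out_features: int


def build_prompt(data_desc: str, task_desc: str, layers: List[LayerSpec]) -> str:
    """Assemble a text prompt from data, task, and architecture descriptions.

    The prompt layout is an assumption; the summary does not specify one.
    """
    arch = ", ".join(f"Linear({l.in_features}->{l.out_features})" for l in layers)
    return f"Data: {data_desc}\nTask: {task_desc}\nArchitecture: {arch}\nParameters:"


def param_count(layers: List[LayerSpec]) -> int:
    """Number of scalars needed: weights plus biases for every layer."""
    return sum(l.in_features * l.out_features + l.out_features for l in layers)


def decode_parameters(
    text: str, layers: List[LayerSpec]
) -> List[Tuple[List[List[float]], List[float]]]:
    """Parse whitespace-separated numbers into (weights, bias) per layer.

    Assumed serialization: for each layer, row-major weights, then biases.
    """
    values = [float(tok) for tok in text.split()]
    expected = param_count(layers)
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    params = []
    pos = 0
    for l in layers:
        weights = [
            values[pos + r * l.in_features : pos + (r + 1) * l.in_features]
            for r in range(l.out_features)
        ]
        pos += l.in_features * l.out_features
        bias = values[pos : pos + l.out_features]
        pos += l.out_features
        params.append((weights, bias))
    return params


def forward(params, x: List[float]) -> List[float]:
    """Run an MLP with ReLU between layers (no activation on the last layer)."""
    for i, (weights, bias) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
        if i < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x


layers = [LayerSpec(2, 2), LayerSpec(2, 1)]
prompt = build_prompt("2-D points", "sum the two inputs", layers)
# Stand-in for text a model might return; not real model output.
generated = "1 0 0 1 0 0 1 1 0"
params = decode_parameters(generated, layers)
result = forward(params, [2.0, 3.0])
```

The length check in `decode_parameters` is the useful design point: generated text carries no guarantee of matching the requested architecture, so a decoder of this kind should reject outputs with the wrong number of values rather than silently truncate or pad them.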
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Topic Modeling
