NeuroGen: Neural Network Parameter Generation via Large Language Models

Jiaqi Wang; Yusen Zhang; Xi Li

arXiv:2505.12470 · cs.AI · May 26, 2025

TL;DR

NeuroGen introduces a novel method where large language models generate neural network parameters conditioned on task descriptions, offering a new paradigm beyond traditional iterative optimization methods.

Contribution

The paper presents a two-stage approach for neural network parameter generation using large language models, including parameter knowledge injection and task-aware instruction tuning.

Findings

01 NeuroGen can generate usable neural network parameters.

02 The approach demonstrates the feasibility of LLM-based parameter generation.

03 Results suggest a new paradigm for neural network training and design.

Abstract

Acquiring the parameters of neural networks (NNs) has been one of the most important problems in machine learning since the inception of NNs. Traditional approaches, such as backpropagation and forward-only optimization, acquire parameters via iterative data fitting to gradually optimize them. This paper aims to explore the feasibility of a new direction: acquiring NN parameters via large language model generation. We propose NeuroGen, a generalized and easy-to-implement two-stage approach for NN parameter generation conditioned on descriptions of the data, task, and network architecture. Stage one is Parameter Reference Knowledge Injection, where LLMs are pretrained on NN checkpoints to build foundational understanding of parameter space, whereas stage two is Context-Enhanced Instruction Tuning, enabling LLMs to adapt to specific tasks through enriched, task-aware prompts. Experimental…
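To make the two-stage idea more concrete, the following is a minimal sketch of how training text for each stage might be assembled. It is not the authors' implementation: the abstract does not say how parameters are serialized or how prompts are laid out, so the text format, the section labels, and the helper names (`serialize_params`, `parse_params`, `build_prompt`, `make_instruction_example`) are all assumptions made for illustration. No LLM is called; the sketch only shows the data that would go in and how generated text could be read back out.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def serialize_params(params: Params, precision: int = 4) -> str:
    """Render flattened parameters as text, one named tensor per line.

    Assumed format (hypothetical, not taken from the paper):
        <tensor name>: <v0> <v1> ...
    """
    lines = []
    for name, values in params.items():
        body = " ".join(f"{v:.{precision}f}" for v in values)
        lines.append(f"{name}: {body}")
    return "\n".join(lines)


def parse_params(text: str) -> Params:
    """Inverse of serialize_params: turn generated text back into numbers."""
    params: Params = {}
    for line in text.strip().splitlines():
        name, _, body = line.partition(":")
        params[name.strip()] = [float(tok) for tok in body.split()]
    return params


def build_prompt(data_desc: str, task_desc: str, arch_desc: str) -> str:
    """Combine the three descriptions the abstract names into one prompt.

    The section labels are an illustrative choice, not the paper's template.
    """
    return (
        f"Data: {data_desc}\n"
        f"Task: {task_desc}\n"
        f"Architecture: {arch_desc}\n"
        "Parameters:\n"
    )


def make_instruction_example(
    data_desc: str, task_desc: str, arch_desc: str, params: Params
) -> Dict[str, str]:
    """Stage-two style pair: a task-aware prompt and the target parameter text."""
    return {
        "prompt": build_prompt(data_desc, task_desc, arch_desc),
        "completion": serialize_params(params),
    }


if __name__ == "__main__":
    # A toy checkpoint standing in for a trained network (values are made up).
    checkpoint: Params = {
        "fc1.weight": [0.5, -0.25, 0.125, 1.0],
        "fc1.bias": [0.0, 0.1],
    }

    # Stage-one style text: the serialized checkpoint alone.
    print(serialize_params(checkpoint))

    # Stage-two style example: descriptions paired with the same parameters.
    example = make_instruction_example(
        data_desc="two-feature tabular data",
        task_desc="binary classification",
        arch_desc="single linear layer, 2 inputs, 2 outputs",
        params=checkpoint,
    )
    print(example["prompt"] + example["completion"])
```

In this sketch, stage one would consume the serialized checkpoints on their own, while stage two would consume prompt and completion pairs. At inference time, `parse_params` stands in for whatever step maps the model's generated text back to tensors that can be loaded into the described architecture.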

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Topic Modeling