NNGPT: Rethinking AutoML with Large Language Models

Roman Kochnev; Waleed Khalid; Tolgay Atinc Uzun; Xi Zhang; Yashkumar Sanjaybhai Dhameliya; Furui Qin; Chandini Vysyaraju; Raghuvir Duvvuri; Avi Goyal; Dmitry Ignatov; Radu Timofte

arXiv:2511.20333·cs.AI·November 26, 2025

TL;DR

NNGPT introduces a novel open-source AutoML framework leveraging large language models to autonomously generate, assess, and improve neural network architectures and hyperparameters, primarily for computer vision tasks.

Contribution

NNGPT is presented as the first framework to integrate five LLM-based pipelines into a unified, self-improving AutoML system that continuously refines neural network models.

Findings

01. Achieves 73% executability on 1,289 targets

02. One-shot prediction matches search-based AutoML, reducing the number of trials needed

03. Outperforms Optuna on hyperparameter optimization, as measured by RMSE

Abstract

Building self-improving AI systems remains a fundamental challenge in the AI domain. We present NNGPT, an open-source framework that turns a large language model (LLM) into a self-improving AutoML engine for neural network development, primarily for computer vision. Unlike previous frameworks, NNGPT extends the dataset of neural networks by generating new models, enabling continuous fine-tuning of LLMs through a closed-loop system of generation, assessment, and self-improvement. It integrates five synergistic LLM-based pipelines within one unified workflow: zero-shot architecture synthesis, hyperparameter optimization (HPO), code-aware accuracy/early-stop prediction, retrieval-augmented synthesis of scope-closed PyTorch blocks (NN-RAG), and reinforcement learning. Built on the LEMUR dataset as an audited corpus with reproducible metrics, NNGPT emits from a single prompt and validates…
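The closed loop of generation, assessment, and self-improvement described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not NNGPT's implementation: `Corpus`, `Record`, `closed_loop`, `toy_generate`, and `toy_assess` are hypothetical names, the generator and assessor are toy stand-ins for an LLM call and a training run, and the fine-tuning step is only marked by a comment.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Record:
    code: str        # source of a generated model (here just a label)
    accuracy: float  # metric measured when the candidate was assessed


@dataclass
class Corpus:
    records: List[Record] = field(default_factory=list)


def closed_loop(
    generate: Callable[[Corpus, int], str],
    assess: Callable[[str], Optional[float]],
    corpus: Corpus,
    rounds: int,
) -> Corpus:
    """Generate candidates, assess them, and grow the corpus with the ones that run."""
    for attempt in range(rounds):
        code = generate(corpus, attempt)   # generation step
        accuracy = assess(code)            # assessment step
        if accuracy is None:               # candidate did not execute: discard it
            continue
        corpus.records.append(Record(code, accuracy))
        # Self-improvement step (assumption): a real system would periodically
        # fine-tune the generator on the grown corpus; omitted in this sketch.
    return corpus


def toy_generate(corpus: Corpus, attempt: int) -> str:
    # Hypothetical stand-in for an LLM call; ignores the corpus contents.
    return f"net_v{attempt}"


def toy_assess(code: str) -> Optional[float]:
    # Hypothetical stand-in for training: every third candidate "fails to run".
    version = int(code.rsplit("v", 1)[1])
    if version % 3 == 0:
        return None
    return 0.5 + 0.01 * version


demo = closed_loop(toy_generate, toy_assess, Corpus(), rounds=6)
print(len(demo.records))
```

In this toy run, four of the six candidates survive assessment and enter the corpus. Passing the generator and assessor as callables keeps the loop itself independent of any particular LLM or training backend.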

Taxonomy

Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications