Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs
Santosh Premi Adhikari, Radu Timofte, Dmitry Ignatov

TL;DR
This paper introduces Delta-Code Generation, a delta-based approach to neural architecture search in which a fine-tuned LLM refines a baseline model by emitting a code diff instead of a full implementation. This cuts output length by 75-85% and lowers computational cost compared to full model synthesis.
Contribution
The paper presents a delta-based fine-tuning pipeline in which LLMs generate compact architecture refinements, improving efficiency and diversity in neural architecture search.
Findings
Delta-based generation surpasses the full-generation baseline (50.6% valid rate, 42.3% mean first-epoch accuracy) in both validity and accuracy.
It reduces output length by 75-85%, saving computational resources.
It achieves high first-epoch accuracy, demonstrating effective architecture refinement.
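To make the idea of a delta concrete, here is a minimal sketch of what a unified diff against a baseline architecture can look like, and why it is shorter than the full file. It uses Python's standard `difflib` module. The baseline network, the refinement (adding batch normalization), the file names, and the helper `make_delta` are all hypothetical illustrations, not taken from the paper. In the paper the LLM emits the delta directly, whereas this sketch derives it from two versions of the code.

```python
import difflib

# Hypothetical baseline architecture source (illustrative, not from the paper).
BASELINE = """import torch.nn as nn

class Net(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))
"""

# Hypothetical refinement: insert a BatchNorm2d layer before the ReLU.
REFINED = BASELINE.replace(
    "nn.ReLU(),\n",
    "nn.BatchNorm2d(32),\n            nn.ReLU(),\n",
)


def make_delta(old: str, new: str) -> str:
    """Return a unified diff that turns `old` into `new`."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile="baseline.py",
        tofile="refined.py",
        n=1,  # one line of context keeps the delta compact
    )
    return "".join(lines)


delta = make_delta(BASELINE, REFINED)
saving = 1 - len(delta) / len(REFINED)

print(delta)
print(f"delta is {saving:.0%} shorter than the full refined file")
```

The saving printed here depends entirely on this toy example and should not be read as a reproduction of the 75-85% figure reported above.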
Abstract
Large language models (LLMs) show strong potential for neural architecture generation, yet existing approaches produce complete model implementations from scratch, which is computationally expensive and yields verbose code. We propose Delta-Code Generation, where fine-tuned LLMs generate compact unified diffs (deltas) to refine baseline architectures rather than synthesizing entire models. Our pipeline iteratively fine-tunes the LLM via LoRA on curated architectures from the LEMUR dataset, with MinHash-Jaccard novelty filtering for structural diversity. We evaluate three 7B-class LLMs -- DeepSeek-Coder-7B, Qwen2.5-Coder-7B, and Mistral-7B -- across six datasets (CIFAR-10, CIFAR-100, MNIST, SVHN, ImageNette, CelebA) using a 22-cycle protocol (1,100 candidates per LLM). All three substantially surpass the full-generation baseline (50.6% valid rate, 42.3% mean first-epoch accuracy):…
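The MinHash-Jaccard novelty filtering mentioned in the abstract can be sketched roughly as follows. This is a minimal pure-Python illustration of the general technique, not the authors' implementation: the tokenizer, the shingle size, the signature length, the similarity threshold, and the function names (`shingles`, `minhash`, `estimated_jaccard`, `is_novel`) are all assumptions made for this example.

```python
import hashlib
import re

NUM_HASHES = 64   # assumed signature length
SHINGLE = 3       # assumed token n-gram size
THRESHOLD = 0.8   # assumed similarity cutoff for "too similar"


def shingles(code: str) -> set:
    """Split code into overlapping token n-grams."""
    tokens = re.findall(r"\w+|[^\w\s]", code)
    if len(tokens) < SHINGLE:
        return {" ".join(tokens)}
    return {
        " ".join(tokens[i:i + SHINGLE])
        for i in range(len(tokens) - SHINGLE + 1)
    }


def minhash(code: str) -> list:
    """MinHash signature: the minimum salted hash per hash function."""
    items = shingles(code)
    signature = []
    for seed in range(NUM_HASHES):
        salt = seed.to_bytes(8, "little")  # a different salt per hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode(), digest_size=8, salt=salt).digest(),
                "big",
            )
            for s in items
        ))
    return signature


def estimated_jaccard(sig_a: list, sig_b: list) -> float:
    """Fraction of matching signature slots estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def is_novel(candidate: str, accepted_signatures: list) -> bool:
    """Accept a candidate only if it is below the cutoff against all others."""
    sig = minhash(candidate)
    return all(
        estimated_jaccard(sig, other) < THRESHOLD
        for other in accepted_signatures
    )
```

In use, a pipeline of this shape would keep a list of signatures for accepted architectures, call `is_novel` on each new candidate, and append `minhash(candidate)` to the list whenever the candidate passes.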
