TL;DR
This paper explores using large language models to assist neural architecture search by reasoning over architectural code, leading to improved channel-configuration designs in vision models.
Contribution
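It introduces a closed-loop LLM framework for channel-configuration search in which the LLM refines architectural specifications using performance feedback. A corpus of synthetic, shape-consistent architectures, generated through AST mutations, supplies the structural examples from which the LLM learns design patterns.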
Findings
LLM-driven NAS yields architectures that outperform the initial architectures on CIFAR-100
Generated architectures exhibit domain-specific design patterns
Language models can learn executable architectural patterns from synthetic data
Abstract
Channel-configuration search, the optimization of layer specifications such as channel widths in deep neural networks, presents a combinatorial challenge constrained by tensor-shape compatibility and computational budgets. We investigate whether large language models (LLMs) can support neural architecture search (NAS) by reasoning over architectural code structures in ways that complement traditional search heuristics. We apply an LLM-driven NAS framework to channel-configuration search, formulating the task as conditional code generation in which the LLM refines architectural specifications using performance feedback. To address data scarcity, we generate a corpus of valid, shape-consistent architectures through abstract syntax tree (AST) mutations. Although these mutated networks are not necessarily optimized for performance, they provide structural examples that help the LLM learn…
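The AST-mutation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `conv(in_channels, out_channels)` notation, the `SEED_SOURCE` network, the list of candidate widths, and every function name below are hypothetical choices made for illustration. The sketch parses a small architecture description with Python's standard `ast` module, changes the output width of one layer, and updates the next layer's input width so the mutated network stays shape-consistent.

```python
import ast
import random

# Hypothetical seed architecture: each conv(in_channels, out_channels) call
# is one layer. This notation is an assumption, not the paper's format.
SEED_SOURCE = """
layers = [
    conv(3, 32),
    conv(32, 64),
    conv(64, 128),
]
"""


def conv_calls(tree):
    """Return the conv(...) call nodes in source order."""
    calls = [
        node for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "conv"
    ]
    return sorted(calls, key=lambda n: (n.lineno, n.col_offset))


def is_shape_consistent(tree):
    """Check that each layer's input width equals the previous output width."""
    pairs = [(c.args[0].value, c.args[1].value) for c in conv_calls(tree)]
    return all(prev[1] == nxt[0] for prev, nxt in zip(pairs, pairs[1:]))


def mutate_width(source, rng, choices=(16, 32, 48, 64, 96, 128, 192, 256)):
    """Change one layer's output width and repair the following layer's input."""
    tree = ast.parse(source)
    calls = conv_calls(tree)
    i = rng.randrange(len(calls))
    new_width = rng.choice(choices)
    calls[i].args[1].value = new_width          # mutate the output channels
    if i + 1 < len(calls):
        calls[i + 1].args[0].value = new_width  # keep the next layer compatible
    assert is_shape_consistent(tree)
    return ast.unparse(tree)                    # requires Python 3.9+


def build_corpus(seed_source, n, seed=0):
    """Apply n successive mutations, collecting every intermediate network."""
    rng = random.Random(seed)
    corpus, current = [], seed_source
    for _ in range(n):
        current = mutate_width(current, rng)
        corpus.append(current)
    return corpus


corpus = build_corpus(SEED_SOURCE, 5)
for sample in corpus:
    print(sample)
```

Every string in `corpus` is valid Python and shape-consistent by construction, although, as the abstract notes for the paper's own corpus, nothing here optimizes the mutated networks for performance. A real pipeline would also need to handle the computational budgets and richer layer types that the abstract mentions, which this sketch leaves out.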
