SEKI: Self-Evolution and Knowledge Inspiration based Neural Architecture Search via Large Language Models

Zicheng Cai; Yaohua Tang; Yutao Lai; Hua Wang; Zhi Chen; Hao Chen

arXiv:2502.20422·cs.CL·March 3, 2025

TL;DR

SEKI is a novel LLM-based neural architecture search method that combines self-evolution and knowledge distillation to efficiently discover high-performance neural architectures without domain-specific data.

Contribution

The paper introduces a two-stage, LLM-driven NAS framework that reaches state-of-the-art results with a search cost of only 0.05 GPU-days and no domain-specific data.

Findings

01. Achieves state-of-the-art (SOTA) performance across multiple datasets and search spaces.

02. Requires only 0.05 GPU-days for architecture search.

03. Demonstrates strong generalization across various tasks.

Abstract

We introduce SEKI, a novel large language model (LLM)-based neural architecture search (NAS) method. Inspired by the chain-of-thought (CoT) paradigm in modern LLMs, SEKI operates in two key stages: self-evolution and knowledge distillation. In the self-evolution stage, LLMs initially lack sufficient reference examples, so we implement an iterative refinement mechanism that enhances architectures based on performance feedback. Over time, this process accumulates a repository of high-performance architectures. In the knowledge distillation stage, LLMs analyze common patterns among these architectures to generate new, optimized designs. By combining these two stages, SEKI fully exploits the capacity of LLMs for NAS without requiring any domain-specific data. Experimental results show that SEKI achieves state-of-the-art (SOTA) performance across various datasets and search spaces while…
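The two stages described in the abstract can be illustrated with a toy control loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the operation list `OPS`, the scoring function `evaluate`, and the helpers `llm_refine` and `llm_inspire` are all hypothetical stand-ins, and the two "LLM" helpers use random edits in place of real model calls so the example runs on its own.

```python
import random

# Hypothetical toy search space; the paper's real search spaces are not modeled here.
OPS = ["conv3x3", "conv5x5", "skip", "maxpool"]


def evaluate(arch):
    """Toy stand-in for training and validating an architecture (assumption)."""
    return sum(op.startswith("conv") for op in arch) / len(arch)


def llm_refine(arch, score, rng):
    """Placeholder for an LLM call that revises one architecture.

    A real prompt would include the architecture and its score as feedback;
    this stub ignores the score and mutates one random position.
    """
    new_arch = list(arch)
    new_arch[rng.randrange(len(new_arch))] = rng.choice(OPS)
    return new_arch


def llm_inspire(top, rng):
    """Placeholder for an LLM call that draws on patterns shared by top designs.

    This stub picks, for each position, the operation used at that position
    by a randomly chosen architecture from the top entries.
    """
    archs = [arch for _, arch in top]
    return [rng.choice(archs)[i] for i in range(len(archs[0]))]


def seki_sketch(n_layers=6, evolve_steps=20, inspire_steps=20, top_k=5, seed=0):
    rng = random.Random(seed)
    current = [rng.choice(OPS) for _ in range(n_layers)]
    current_score = evaluate(current)
    repository = [(current_score, current)]

    # Stage 1 (self-evolution): refine iteratively from performance feedback,
    # recording every evaluated candidate in the repository.
    for _ in range(evolve_steps):
        candidate = llm_refine(current, current_score, rng)
        score = evaluate(candidate)
        repository.append((score, candidate))
        if score >= current_score:
            current, current_score = candidate, score

    # Stage 2 (knowledge stage): generate new designs from the best entries so far.
    for _ in range(inspire_steps):
        top = sorted(repository, key=lambda item: item[0], reverse=True)[:top_k]
        candidate = llm_inspire(top, rng)
        repository.append((evaluate(candidate), candidate))

    # Return the best (score, architecture) pair found across both stages.
    return max(repository, key=lambda item: item[0])


best_score, best_arch = seki_sketch()
print(best_score, best_arch)
```

The point of the sketch is the data flow: stage one fills a repository because no reference examples exist at the start, and stage two only begins drawing on that repository once it holds evaluated candidates. In an actual system, the two stub helpers would be replaced by prompts to an LLM and `evaluate` by real training or a performance estimate.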

Taxonomy

Methods: Knowledge Distillation