STAR: Synthesis of Tailored Architectures
Armin W. Thomas, Rom Parnichkun, Alexander Amini, Stefano Massaroli, and Michael Poli

TL;DR
STAR introduces a novel evolutionary approach for synthesizing tailored neural network architectures by leveraging a new search space based on linear input-varying systems theory, enabling efficient multi-metric optimization.
Contribution
The paper presents a new architecture synthesis method combining a theoretically grounded search space with evolutionary algorithms for multi-objective optimization.
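To make the pairing of a numerical genome with multi-objective evolutionary search concrete, here is a minimal sketch. It is not the paper's algorithm: the flat integer genome, the `toy_objectives` function, and the simple Pareto-dominance selection are all illustrative assumptions standing in for STAR's hierarchical encoding and its actual quality and efficiency metrics.

```python
import random


def dominates(a, b):
    """True if objective vector a is at least as good as b everywhere and
    strictly better somewhere (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(population, evaluate):
    """Keep the genomes whose objective vectors no other genome dominates."""
    scored = [(genome, evaluate(genome)) for genome in population]
    return [g for g, s in scored if not any(dominates(t, s) for _, t in scored)]


def mutate(genome, n_choices, rate, rng):
    """Resample each gene with probability `rate`."""
    return [rng.randrange(n_choices) if rng.random() < rate else gene
            for gene in genome]


def crossover(a, b, rng):
    """Single-point recombination of two parent genomes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def toy_objectives(genome):
    """Hypothetical stand-ins for real metrics: the first term penalizes
    repeated operator choices, the second treats gene values as a size cost."""
    quality_loss = len(genome) - len(set(genome))
    size = sum(genome)
    return (quality_loss, size)


def evolve(pop_size=16, genome_len=6, n_choices=4, generations=10, seed=0):
    rng = random.Random(seed)
    population = [[rng.randrange(n_choices) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Non-dominated genomes survive; cap them so children are always produced.
        parents = pareto_front(population, toy_objectives)[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            children.append(mutate(crossover(a, b, rng), n_choices, 0.2, rng))
        population = parents + children
    return pareto_front(population, toy_objectives)
```

Calling `evolve()` returns the non-dominated genomes of the final population. In a real setting each evaluation would involve training and measuring a model, which is where most of the cost lies.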
Findings
Outperforms existing Transformer models in quality and efficiency.
Enables optimization of diverse architectures with various computational units.
Achieves state-of-the-art results in autoregressive language modeling.
Abstract
Iterative improvement of model architectures is fundamental to deep learning: Transformers first enabled scaling, and recent advances in model hybridization have pushed the quality-efficiency frontier. However, optimizing architectures remains challenging and expensive. Current automated or manual approaches fall short, largely due to limited progress in the design of search spaces and due to the simplicity of resulting patterns and heuristics. In this work, we propose a new approach for the synthesis of tailored architectures (STAR). Our approach combines a novel search space based on the theory of linear input-varying systems, supporting a hierarchical numerical encoding into architecture genomes. STAR genomes are automatically refined and recombined with gradient-free, evolutionary algorithms to optimize for multiple model quality and efficiency metrics. Using STAR, we optimize large…
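As a rough illustration of what "linear input-varying" means, the sketch below treats a toy attention layer as an operator `T(x)` that is built from the input and then applied linearly to it, i.e. `y = T(x) x`. The projection-free attention and the function names are assumptions made for brevity, not the paper's formulation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_operator(x):
    """Build the token-mixing matrix T(x) from the input itself.
    Toy assumption: queries and keys are the raw tokens (no learned projections)."""
    d = len(x[0])
    scores = [[sum(a * b for a, b in zip(xi, xj)) / math.sqrt(d) for xj in x]
              for xi in x]
    return [softmax(row) for row in scores]


def apply_operator(T, x):
    """Apply a fixed matrix T to the token sequence x (a plain linear map)."""
    n, d = len(x), len(x[0])
    return [[sum(T[i][j] * x[j][c] for j in range(n)) for c in range(d)]
            for i in range(len(T))]


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three tokens, two channels
T = attention_operator(x)                 # the operator depends on the input
y = apply_operator(T, x)                  # y = T(x) x
```

Once `T` is fixed, the map from `x` to `y` is linear; the nonlinearity of the layer comes entirely from how `T` is computed from `x`.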
Peer Reviews
Decision · ICLR 2025 Oral
- It shows promising results.
- The encoding for the architecture genome seems to be novel.
- The idea of using an evolutionary algorithm for architecture design space exploration is not new.
- The mutation mainly happens on the backbone, and it is not clear whether it will affect the underlying components.
The work proposes a straightforward method for constructing a well-conditioned and comprehensive model architecture search space that is amenable to evolutionary optimization under multiple objectives. The hierarchical construction of the LIV building blocks and the coding schemes ensures that a wide variety of candidate architectures can be expressed and searched efficiently. Notably, the genome representation has a clear interpretation which makes it easy to apply constraints to ensure robustn
By construction, the method appears to be geared towards finding tweaks of transformer-based architecture stacks and does not aim to discover fundamentally new architecture designs. For instance, genomes are constructed assuming a pre-norm residual structure that cannot be varied by the optimizer. This is a reasonable assumption but it arguably relies on the manual design expertise that automated architecture search is ultimately aiming to overcome. Can you think of ways to extend the genome to
- The paper presents an important research problem for the ML design automation community: The limitations of existing search spaces, especially for hybrid and transformer models in language processing applications, which typically require significant computational resources to evaluate different design choices.
- The hierarchical search space design is grounded in the theory of linear input-varying systems, providing a solid theoretical foundation. Characterizing neural network architectures th
- While Section 2 provides a thorough overview of the hierarchical search space, certain design choices would benefit from additional explanation, particularly in relation to existing implementations and empirical justification for their inclusion. For example, the distinction between channel and token mixing could be clarified, along with an explanation of the LIVs to which each can be applied. Since STAR operates within a hybrid design space that may incorporate either convolution or attention
Taxonomy
Topics: Architecture and Computational Design · BIM and Construction Integration
