Analysis and Explainability of LLMs Via Evolutionary Methods
Shannon K. Gallagher, Swati Rallapalli, Tyler Brooks, Chuck Loughin, Michele Sezgin, Ronald Yurko

TL;DR
This paper applies evolutionary methods to large language models (LLMs) to analyze model relationships, lineage, and dataset importance, with visualizations and experiments that recover training lineage and highlight the weight layers that differ most between models.
Contribution
It extends evolutionary analysis techniques to neural networks, specifically LLMs, to improve understanding of model lineage, dataset influence, and model relationships.
Findings
In a controlled experiment, estimated evolutionary trees reliably recover the topology of the ground-truth training tree.
Weight differences identify the layers that contribute most to differences between models.
An unsupervised evolutionary tree is generated for black-box models.
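To make the layer finding concrete, here is a minimal sketch of ranking layers by how much their weights differ between two models. It is an illustration only, not the authors' procedure: the Euclidean distance, the function names, and the toy layer names and values are all assumptions.

```python
import math


def layer_distances(model_a, model_b):
    """Return {layer_name: Euclidean distance} for layers present in both models.

    Each model is a dict mapping a layer name to a flat list of weights.
    This representation is a hypothetical simplification.
    """
    distances = {}
    for name in sorted(set(model_a) & set(model_b)):
        wa, wb = model_a[name], model_b[name]
        if len(wa) != len(wb):
            raise ValueError(f"layer {name!r} has mismatched shapes")
        distances[name] = math.sqrt(sum((x - y) ** 2 for x, y in zip(wa, wb)))
    return distances


def rank_layers(model_a, model_b):
    """List layer names from most changed to least changed."""
    distances = layer_distances(model_a, model_b)
    return sorted(distances, key=distances.get, reverse=True)


# Toy weights (made up for illustration).
base = {"embed": [0.0, 0.0], "attn.0": [1.0, 1.0], "mlp.0": [2.0, 2.0]}
tuned = {"embed": [0.0, 0.1], "attn.0": [1.0, 4.0], "mlp.0": [2.5, 2.0]}

ranking = rank_layers(base, tuned)
print(ranking)
```

On real checkpoints the per-layer distance would likely need normalizing by layer size so that large layers do not dominate the ranking.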
Abstract
Evolutionary methods have long been useful for analysis and explanation in genetics, biology, ecology, and related fields. In this work, we extend these methods to neural networks, specifically large language models (LLMs), to better analyze and explain relationships among models. We show how relating weights to genotypes and output text to phenotypes can improve our understanding of model lineage, important datasets, the roles of different model layers, and visualization of model relationships. We demonstrate this in a controlled experiment, where our estimated evolutionary trees reliably recover the topology of the ground-truth training tree. We further identify the most important weight layers according to weight differences and show through phenotypic experiments that one training dataset appears to contribute more useful information than the others. Finally, we generate an…
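The idea of treating weights as genotypes and estimating a tree from them can be sketched as follows. This is a hedged illustration, not the paper's method: it assumes each model is summarized by a flat weight vector, uses Euclidean distance, and builds the tree with UPGMA (average-linkage clustering), a classic phylogenetic heuristic chosen here for brevity. The model names and values are hypothetical.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def upgma(genotypes):
    """Build a rooted tree by average-linkage (UPGMA) clustering.

    genotypes: dict mapping model name -> flat list of weights.
    Returns a nested tuple of model names, e.g. ("c", ("a", "b")).
    """
    # Each cluster key is a subtree; its value lists the leaf models it contains.
    clusters = {name: [name] for name in genotypes}

    def avg_dist(a, b):
        # Mean pairwise distance between all leaves of the two clusters.
        pairs = [(x, y) for x in clusters[a] for y in clusters[b]]
        return sum(euclidean(genotypes[x], genotypes[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters into a new internal node.
        a, b = min(combinations(clusters, 2), key=lambda p: avg_dist(*p))
        members = clusters.pop(a) + clusters.pop(b)
        clusters[(a, b)] = members

    return next(iter(clusters))


# Toy "genotypes": a base model, two closely related fine-tunes, one distant fine-tune.
genotypes = {
    "base": [0.0, 0.0],
    "ft_a": [0.0, 1.0],
    "ft_a2": [0.0, 1.2],
    "ft_b": [5.0, 0.0],
}

tree = upgma(genotypes)
print(tree)
```

In this toy case the two closely related fine-tunes are grouped first, then joined with the base model, and the distant fine-tune attaches last. UPGMA assumes a roughly constant rate of change along all branches, which may not hold for model training, so other tree-building methods could give different topologies.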
