Search over Self-Edit Strategies for LLM Adaptation
Alistair Cheong, Haolin Cong, Tyler Yang, Dustin Miao

TL;DR
This paper asks whether an LLM can use task feedback to decide for itself how to update its weights. It studies model-generated self-edit strategies within a test-time adaptation framework and analyzes their effectiveness and limitations.
Contribution
It lets the LLM generate its own self-edit templates for weight updates instead of relying on a fixed human-written template, and studies how archive-based guidance affects self-improvement.
Findings
Archive guidance improves self-edit performance but can cause homogenization.
A naive archive provides short-term robustness but may hinder long-term progress.
Explicit novelty pressure is needed for sustained advancement beyond human strategies.
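
To make the contrast between a naive archive and explicit novelty pressure concrete, here is a minimal Python sketch. It is not the paper's implementation: `ArchiveEntry`, `select_naive`, `select_with_novelty`, the token-level Jaccard distance, and the `novelty_weight` parameter are all hypothetical names and choices made for illustration. The sketch assumes an archive of past self-edit templates with task scores, from which a few exemplars are chosen to guide the next generation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveEntry:
    template: str  # text of a self-edit template (hypothetical representation)
    score: float   # task score obtained after applying it (hypothetical)


def jaccard_distance(a: str, b: str) -> float:
    """Token-set distance in [0, 1]; a crude stand-in for a novelty measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    if not union:
        return 0.0
    return 1.0 - len(ta & tb) / len(union)


def select_naive(archive, k):
    """Naive archive guidance: show the k highest-scoring templates."""
    return sorted(archive, key=lambda e: e.score, reverse=True)[:k]


def select_with_novelty(archive, k, novelty_weight=1.0):
    """Greedy selection that trades score against distance to chosen exemplars."""
    remaining = list(archive)
    chosen = []
    while remaining and len(chosen) < k:
        def utility(entry):
            if not chosen:
                return entry.score
            # Novelty = distance to the closest exemplar already selected.
            novelty = min(
                jaccard_distance(entry.template, c.template) for c in chosen
            )
            return entry.score + novelty_weight * novelty

        best = max(remaining, key=utility)
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    archive = [
        ArchiveEntry("list implications of the passage", 0.80),
        ArchiveEntry("list implications of the passage briefly", 0.79),
        ArchiveEntry("write question answer pairs", 0.60),
    ]
    print([e.template for e in select_naive(archive, 2)])
    print([e.template for e in select_with_novelty(archive, 2)])
```

In this toy archive, score-only selection returns two near-duplicate templates, while the novelty-weighted variant pairs the top template with a structurally different one. That is one simple way homogenization could arise and be counteracted; the paper's actual archive and novelty mechanisms may differ.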
Abstract
Many LLM-based open-ended search systems freeze the foundation model that proposes improvements to existing solutions, which may bottleneck long-run progress. Recent work has explored updating the proposal model at test time [arXiv:2511.23473], but the update strategy is still typically hand-specified. Therefore, this study investigated whether an LLM can use task feedback to decide how it should update its weights. For tractability, we focused on the simpler case where there is only one round of self-improvement, and restricted the update operator to self-supervised next token prediction (NTP), leaving the model freedom in choosing its training data and key NTP hyperparameters. Using the Self-Adapting Language Models (SEAL) [arXiv:2506.10943] framework as a testbed, we relaxed its fixed human template constraint and allowed the model to generate its own self-edit templates, thereby…
Taxonomy
Topics: Web Data Mining and Analysis · Information Retrieval and Search Behavior · Topic Modeling
