ALAS: Autonomous Learning Agent for Self-Updating Language Models
Dhruv Atreja

TL;DR
ALAS is a modular system that updates language models with recent information, improving their accuracy on evolving topics through continual learning with minimal human intervention.
Contribution
It introduces a novel autonomous pipeline for continual LLM knowledge updating, combining curriculum generation, web retrieval, data distillation, and iterative fine-tuning.
Findings
Raises accuracy on post-cutoff questions from 15% to 90% on average
Reduces manual dataset curation efforts
Demonstrates effective continual learning in dynamic domains
Abstract
Large language models (LLMs) often have a fixed knowledge cutoff, limiting their accuracy on emerging information. We present ALAS (Autonomous Learning Agent System), a modular pipeline that continuously updates an LLM's knowledge with minimal human intervention. ALAS autonomously generates a learning curriculum for a target domain, retrieves up-to-date information from the web (with citations), distills this into question-answer training data, and fine-tunes the model through supervised fine-tuning (SFT) and direct preference optimization (DPO). It iteratively evaluates performance and revises the curriculum, enabling long-term continual learning. We demonstrate ALAS's ability to self-improve a model on rapidly evolving domains (e.g., new Python releases, latest security CVEs, academic trends), significantly boosting post-cutoff question answering accuracy (from 15% to 90% on average)…
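The loop described in the abstract (curriculum, retrieval, distillation, fine-tuning, evaluation, curriculum revision) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `QAPair`, `Stages`, and `run_update_loop`, the 0.9 target, the round limit, and the revision rule (keep only topics that still score below the target) are all hypothetical. The toy stage functions stand in for a real search backend, an LLM distiller, and an SFT plus DPO trainer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class QAPair:
    topic: str
    question: str
    answer: str
    source: str  # citation for the retrieved document


@dataclass
class Stages:
    """Hypothetical stage callables; a real system would wrap search, an LLM, and a trainer."""
    retrieve: Callable[[str], List[str]]                # topic -> retrieved documents
    distill: Callable[[str, List[str]], List[QAPair]]   # topic, documents -> QA training pairs
    fine_tune: Callable[[List[QAPair]], None]           # SFT followed by DPO in a real system
    evaluate: Callable[[str], float]                    # topic -> accuracy in [0, 1]


def run_update_loop(curriculum: List[str], stages: Stages,
                    target: float = 0.9, max_rounds: int = 3) -> List[Dict[str, float]]:
    """Run retrieve -> distill -> fine-tune -> evaluate rounds, revising the curriculum each time."""
    history: List[Dict[str, float]] = []
    for _ in range(max_rounds):
        if not curriculum:
            break
        pairs: List[QAPair] = []
        for topic in curriculum:
            docs = stages.retrieve(topic)
            pairs.extend(stages.distill(topic, docs))
        stages.fine_tune(pairs)
        scores = {topic: stages.evaluate(topic) for topic in curriculum}
        history.append(scores)
        # Assumed revision rule: keep only topics still below the target accuracy.
        curriculum = [t for t in curriculum if scores[t] < target]
    return history


# Toy stand-ins so the sketch runs end to end (no real model or web access).
seen: Dict[str, int] = {}


def toy_retrieve(topic: str) -> List[str]:
    return [f"document about {topic}"]


def toy_distill(topic: str, docs: List[str]) -> List[QAPair]:
    return [QAPair(topic, f"What changed in {topic}?", d, "https://example.com") for d in docs]


def toy_fine_tune(pairs: List[QAPair]) -> None:
    for pair in pairs:
        seen[pair.topic] = seen.get(pair.topic, 0) + 1


def toy_evaluate(topic: str) -> float:
    # Pretend accuracy rises by 0.5 for each round of training on the topic.
    return min(1.0, 0.5 * seen.get(topic, 0))


stages = Stages(toy_retrieve, toy_distill, toy_fine_tune, toy_evaluate)
history = run_update_loop(["python-release", "cve-feed"], stages)
```

Passing the stages as callables keeps the control loop separate from any particular retriever or trainer, so each stage can be swapped or tested in isolation.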
