ChameleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters

Kamer Ali Yuksel; Hassan Sawaf

arXiv:2502.04315 · cs.CL · February 12, 2025

Open Access · 1 Repo

TL;DR

ChameleonLLM is an inference-time adaptation framework for large language models. It clusters the inputs in each batch and generates low-rank weight updates per cluster, improving performance without maintaining multiple expert models.

Contribution

It proposes a method for inference-time LLM adaptation that groups similar inputs within a batch and uses a hyper-network to generate low-rank updates for each group. The paper reports that this outperforms conventional LoRA fine-tuning.

Findings

01 · Outperforms conventional LoRA methods in experiments.

02 · Eliminates the need for multiple expert models.

03 · Provides a versatile, adaptive inference solution.

Abstract

Recent advances in large language models (LLMs) have shown remarkable performance across diverse tasks. However, these models are typically deployed with fixed weights, which limits their ability to adapt dynamically to the variability inherent in real-world data during inference. This paper introduces ChameleonLLM, a novel framework that enables inference-time adaptation of LLMs by leveraging batch-aware clustering and on-the-fly generation of low-rank updates. Unlike traditional fine-tuning approaches such as Low-Rank Adaptation (LoRA) or methods that rely on a fixed set of pre-learned uniforms (changeable masks), our method dynamically generates adaptive modifications to the decoder weights based on the aggregated statistics of clustered batches. By intelligently grouping similar inputs and computing context-aware low-rank updates via a hyper-network, ChameleonLLM achieves…
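
The mechanism described in the abstract (group similar inputs, summarize each group, and let a hyper-network turn that summary into a low-rank weight update) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything named here is an assumption made for illustration: the fixed centroids, the use of the cluster mean as the "aggregated statistic", the single linear layer standing in for the hyper-network, and all function and class names. It uses only the Python standard library so that it runs anywhere.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def matmul(a, b):
    """Plain-Python product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def assign_clusters(embeddings, centroids):
    """Assign each embedding to its nearest centroid (squared Euclidean distance).

    Assumption: centroids are given. A real system would likely cluster the batch itself.
    """
    def dist2(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return [min(range(len(centroids)), key=lambda c: dist2(e, centroids[c]))
            for e in embeddings]


class ToyHyperNetwork:
    """Hypothetical linear hyper-network: cluster statistic -> low-rank factors A and B."""

    def __init__(self, embed_dim, d_out, d_in, rank, seed=0):
        rng = random.Random(seed)
        self.d_out, self.d_in, self.rank = d_out, d_in, rank
        n_params = rank * (d_out + d_in)  # entries of A (d_out x r) plus B (r x d_in)
        self.proj = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)]
                     for _ in range(n_params)]

    def __call__(self, stat):
        flat = [sum(w * s for w, s in zip(row, stat)) for row in self.proj]
        split = self.d_out * self.rank
        a = [flat[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        b_flat = flat[split:]
        b = [b_flat[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        return a, b


def adapted_weights(base_weight, embeddings, centroids, hyper):
    """Return cluster labels and one adapted weight matrix W + A @ B per non-empty cluster."""
    labels = assign_clusters(embeddings, centroids)
    adapted = {}
    for c in sorted(set(labels)):
        members = [e for e, lab in zip(embeddings, labels) if lab == c]
        a, b = hyper(mean_vector(members))   # cluster mean as the aggregated statistic
        delta = matmul(a, b)                 # low-rank update of shape d_out x d_in
        adapted[c] = [[w + d for w, d in zip(w_row, d_row)]
                      for w_row, d_row in zip(base_weight, delta)]
    return labels, adapted


# Toy batch: four 2-d "embeddings" forming two obvious groups.
batch_embeddings = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
centroids = [[0.0, 0.0], [1.0, 1.0]]
base_weight = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # 3 x 2 stand-in for a decoder weight
hyper = ToyHyperNetwork(embed_dim=2, d_out=3, d_in=2, rank=1)
labels, weights = adapted_weights(base_weight, batch_embeddings, centroids, hyper)
```

In this sketch the base weight is never modified. Each cluster gets its own temporary `W + A @ B`, which is how a single set of base weights could serve differently adapted groups within one batch.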

Code & Models

Repositories

kayuksel/ChamaleonLLM
PyTorch · Official

Taxonomy

Topics: Video Surveillance and Tracking Methods · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning

Methods: Sparse Evolutionary Training