Mixture of Lookup Key-Value Experts

Zongcheng Wang

arXiv:2512.09723·cs.LG·December 11, 2025


TL;DR

The paper introduces MoLKV, an extension of MoLE that selects among key-value experts based on context, aiming to improve performance while remaining suitable for resource-constrained devices.

Contribution

MoLKV extends MoLE with context-aware expert selection through key-value interactions, addressing the limitation that MoLE selects experts from input token ids alone.

Findings

01. MoLKV achieves lower validation loss than MoLE in experiments.

02. Context-aware expert selection improves model performance.

03. MoLKV maintains low communication overhead for resource-limited devices.

Abstract

Recent research has developed several LLM architectures suitable for inference on end-user devices, such as the Mixture of Lookup Experts (MoLE) (Jie et al., 2025). A key feature of MoLE is that each token id is associated with a dedicated group of experts. For a given input, only the experts corresponding to the input token id will be activated. Since the communication overhead of loading this small number of activated experts into RAM during inference is negligible, expert parameters can be offloaded to storage, making MoLE suitable for resource-constrained devices. However, MoLE's context-independent expert selection mechanism, based solely on input ids, may limit model performance. To address this, we propose the Mixture of Lookup Key-Value Experts (MoLKV) model. In MoLKV, each expert is structured as a key-value pair.…
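
The abstract is truncated here, so the exact MoLKV formulation is not shown. As a rough illustration only, the sketch below combines the two ideas the abstract does state: experts are looked up by token id (so only a small group is loaded from storage), and each expert is a key-value pair. Everything else is an assumption made for the example, not the paper's method: the toy sizes, the names `build_storage`, `lookup_experts`, and `combine`, and in particular the softmax over query-key dot products used to weight the values.

```python
import math

# Hypothetical toy sizes; none of these values come from the paper.
VOCAB_SIZE = 6
EXPERTS_PER_TOKEN = 3
DIM = 4


def build_storage():
    """Stand-in for offloaded storage: token id -> list of (key, value) experts."""
    storage = {}
    for token_id in range(VOCAB_SIZE):
        experts = []
        for e in range(EXPERTS_PER_TOKEN):
            # Deterministic filler numbers in place of learned parameters.
            key = [math.sin(token_id + e + d) for d in range(DIM)]
            value = [math.cos(token_id * (e + 1) + d) for d in range(DIM)]
            experts.append((key, value))
        storage[token_id] = experts
    return storage


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def lookup_experts(storage, token_id):
    """Load only the expert group tied to this token id (the lookup step)."""
    return storage[token_id]


def combine(experts, query):
    """Assumed context-aware mixing: weight each value by its query-key score."""
    scores = [sum(q * k for q, k in zip(query, key)) for key, _ in experts]
    weights = softmax(scores)
    out = [0.0] * DIM
    for w, (_, value) in zip(weights, experts):
        for d in range(DIM):
            out[d] += w * value[d]
    return out, weights


storage = build_storage()
token_id = 2
query_a = [1.0, 0.0, 0.0, 0.0]  # stands in for one context
query_b = [0.0, 0.0, 0.0, 1.0]  # stands in for a different context

experts = lookup_experts(storage, token_id)
out_a, w_a = combine(experts, query_a)
out_b, w_b = combine(experts, query_b)

print(len(experts), "of", VOCAB_SIZE * EXPERTS_PER_TOKEN, "experts loaded")
print("weights under context A:", w_a)
print("weights under context B:", w_b)
```

In this toy setup the same token id always loads the same small expert group, while the mixing weights change with the query. That is one plausible reading of "context-aware" selection on top of an id-based lookup; the paper's actual mechanism may differ.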


Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Context-Aware Activity Recognition Systems · IoT and Edge/Fog Computing