Latent Space Communication via K-V Cache Alignment

Lucio M. Dery; Zohar Yahav; Henry Prior; Qixuan Feng; Jiajun Shen; Arthur Szlam

arXiv:2601.06123·cs.LG·January 13, 2026

Open Access

TL;DR

This paper learns a shared latent space that aligns the key-value (k-v) caches of multiple models, so that models can exchange internal state directly and transfer learned skills to one another without any change to their pre-trained parameters.

Contribution

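The paper proposes a shared representation space that aligns the k-v caches of multiple models. Each model is augmented with adapters that translate its state into and out of this space, which enables direct communication and skill transfer while the pre-trained parameters stay unchanged.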

Findings

01. Enhanced inter-model communication demonstrated with Gemma-2 models

02. Improved individual model performance through the shared space

03. Successful transfer of learned skills such as soft prompts

Abstract

Solving increasingly complex problems with large language models (LLMs) necessitates a move beyond individual models and towards multi-model systems that can effectively collaborate. While text has traditionally served as the medium for inter-model communication, a richer and more efficient exchange is possible if models can access each other's internal states directly. In this paper, we propose learning a shared representation space that aligns the k-v caches of multiple models, creating a high-bandwidth channel for collaboration without altering the underlying pre-trained parameters. We do so by augmenting each model with adapters to translate its state into and out of this shared space. Via a suite of experiments with Gemma-2 models, we demonstrate that this approach not only enables seamless inter-model communication but also improves individual model performance. We also show that…
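The adapter idea in the abstract (each model translates its cache into and out of a shared space) can be illustrated with a minimal sketch. The abstract does not specify the adapter architecture, the training objective, or the cache layout, so everything here is an assumption for illustration: the `CacheAdapter` class, the use of plain linear maps, the dimensions, and the treatment of a cache as a single `(seq_len, model_dim)` array are all hypothetical. The weights are random rather than trained; in the setting described, only the adapters would be learned while the base models stay frozen.

```python
import numpy as np


class CacheAdapter:
    """Hypothetical adapter pair for one model (not the paper's architecture).

    Holds two linear maps: one from the model's cache space into the shared
    space, and one from the shared space back into the model's cache space.
    """

    def __init__(self, model_dim, shared_dim, rng):
        # Random initialisation stands in for trained adapter weights.
        self.to_shared = rng.normal(
            0.0, 1.0 / np.sqrt(model_dim), size=(model_dim, shared_dim)
        )
        self.from_shared = rng.normal(
            0.0, 1.0 / np.sqrt(shared_dim), size=(shared_dim, model_dim)
        )

    def encode(self, cache):
        # (seq_len, model_dim) -> (seq_len, shared_dim)
        return cache @ self.to_shared

    def decode(self, shared):
        # (seq_len, shared_dim) -> (seq_len, model_dim)
        return shared @ self.from_shared


def translate_cache(cache, sender, receiver):
    """Route a sender's cache through the shared space into the receiver's space."""
    return receiver.decode(sender.encode(cache))


rng = np.random.default_rng(0)
shared_dim = 16  # assumed size of the shared space
adapter_a = CacheAdapter(model_dim=8, shared_dim=shared_dim, rng=rng)
adapter_b = CacheAdapter(model_dim=12, shared_dim=shared_dim, rng=rng)

# Toy stand-in for model A's cached keys over 5 token positions.
keys_a = rng.normal(size=(5, 8))
keys_for_b = translate_cache(keys_a, adapter_a, adapter_b)
print(keys_for_b.shape)  # (5, 12)
```

Routing every model through one shared space means each model needs only its own adapter pair, rather than a separate translator for every pair of models.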

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare