CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following
Kaiyan Zhang, Jianyu Wang, Ermo Hua, Biqing Qi, Ning Ding, Bowen Zhou

TL;DR
CoGenesis is a collaborative framework that combines large cloud-based and small local language models to enhance privacy and efficiency in context-aware instruction following.
Contribution
The paper introduces a novel collaborative generation framework integrating large and small models to address privacy concerns in language model deployment.
Findings
Large models excel with user context but struggle without it.
Small models fine-tuned on synthetic data are promising but less capable than large models.
CoGenesis with mixed models offers competitive performance and privacy benefits.
Abstract
With the advancement of language models (LMs), their exposure to private data is increasingly inevitable, and their deployment (especially for smaller ones) on personal devices, such as PCs and smartphones, has become a prevailing trend. In contexts laden with user information, enabling models to both safeguard user privacy and execute commands efficiently emerges as an essential research imperative. In this paper, we propose CoGenesis, a collaborative generation framework integrating large (hosted on cloud infrastructure) and small models (deployed on local devices) to address privacy concerns logically. Initially, we design a pipeline to create personalized writing instruction datasets enriched with extensive context details as the testbed of this research issue. Subsequently, we introduce two variants of CoGenesis based on sketch and logits respectively. Our experimental findings,…
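The abstract names a sketch-based variant but does not spell out its mechanics here. The following is a minimal sketch of one plausible reading, assuming the cloud-hosted large model sees only the bare instruction and returns an outline, and the on-device small model then expands that outline using the private user context. The function `sketch_collaboration`, the prompt layout, and the stand-in models `fake_cloud` and `fake_local` are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List


def sketch_collaboration(
    instruction: str,
    private_context: str,
    cloud_model: Callable[[str], str],
    local_model: Callable[[str], str],
) -> str:
    """Hypothetical sketch-based flow: private context never leaves the device."""
    # Step 1 (cloud): the large model drafts an outline from the instruction only.
    sketch = cloud_model(instruction)
    # Step 2 (device): the small model combines the outline with the private context.
    local_prompt = (
        f"Context: {private_context}\n"
        f"Instruction: {instruction}\n"
        f"Sketch: {sketch}"
    )
    return local_model(local_prompt)


# Stand-in models for illustration only; real models would replace these.
cloud_inputs: List[str] = []  # records everything sent to the "cloud"


def fake_cloud(prompt: str) -> str:
    cloud_inputs.append(prompt)
    return "1. Greeting 2. Request 3. Sign-off"


def fake_local(prompt: str) -> str:
    # Echoes its prompt so the data flow is visible in the result.
    return "DRAFT<<" + prompt + ">>"


result = sketch_collaboration(
    "Write an email asking for leave.",
    "Name: Alex; manager: Sam",
    fake_cloud,
    fake_local,
)
```

In this toy setup, `cloud_inputs` holds only the instruction string, which makes the intended privacy boundary easy to check: anything derived from the user context appears only in the prompt handed to the local model.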
Taxonomy
Topics: Advanced Malware Detection Techniques · Digital and Cyber Forensics · Context-Aware Activity Recognition Systems
