Be Cautious When Merging Unfamiliar LLMs: A Phishing Model Capable of Stealing Privacy

Zhenyuan Guo; Yi Shi; Wenlong Meng; Chen Gong; Chengkun Wei; Wenzhi Chen

arXiv:2502.11533·cs.CL·February 18, 2025

TL;DR

This paper uncovers a privacy risk in merging large language models, demonstrating how malicious models can steal personal data after merging, and proposes a method to conceal such attacks.

Contribution

It introduces PhiMM, a novel privacy attack method for LLM merging, and a cloaking technique to hide malicious intent, highlighting new security concerns in model merging.

Findings

01 Merging phishing models increases privacy breach risks.

02 Leakage of personally identifiable information (PII) increased by 3.9% after merging.

03 Leakage of membership information (MI) increased by 17.4% after merging.

Abstract

Model merging is a widespread technology in large language models (LLMs) that integrates multiple task-specific LLMs into a unified one, enabling the merged model to inherit the specialized capabilities of these LLMs. Most task-specific LLMs are sourced from open-source communities and have not undergone rigorous auditing, potentially imposing risks in model merging. This paper highlights an overlooked privacy risk: an unsafe model could compromise the privacy of other LLMs involved in the model merging. Specifically, we propose PhiMM, a privacy attack approach that trains a phishing model capable of stealing privacy using a crafted privacy phishing instruction dataset. Furthermore, we introduce a novel model cloaking method that mimics a specialized capability to conceal attack intent, luring users into merging the phishing model. Once victims merge the phishing model, the…
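For readers unfamiliar with model merging, the following is a minimal sketch of one common scheme, task arithmetic, in which each fine-tuned model's weight difference from a shared base is scaled and added back to the base. The abstract does not say which merging algorithms the paper evaluates, so the choice of task arithmetic is an assumption made for illustration. The function names, the `scale` parameter, and the toy weights are hypothetical, and plain Python lists stand in for real tensors.

```python
from typing import Dict, List

# A toy "state dict": parameter name -> flat list of weights.
# Real LLM checkpoints hold tensors; lists keep this sketch dependency-free.
StateDict = Dict[str, List[float]]


def task_vector(base: StateDict, finetuned: StateDict) -> StateDict:
    """Return the per-parameter difference (finetuned - base)."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def merge_task_arithmetic(
    base: StateDict,
    finetuned_models: List[StateDict],
    scale: float = 1.0,  # hypothetical merging coefficient
) -> StateDict:
    """Add each model's scaled task vector to a copy of the base weights."""
    merged = {name: list(weights) for name, weights in base.items()}
    for model in finetuned_models:
        delta = task_vector(base, model)
        for name, d in delta.items():
            merged[name] = [m + scale * x for m, x in zip(merged[name], d)]
    return merged


# Toy usage: two task-specific models fine-tuned from the same base.
base = {"w": [0.0, 1.0]}
model_a = {"w": [0.5, 1.0]}  # illustrative weights only
model_b = {"w": [0.0, 2.0]}  # illustrative weights only
merged = merge_task_arithmetic(base, [model_a, model_b], scale=0.5)
```

In this scheme every contributing model's weight differences flow directly into the merged parameters, which is why the provenance of each contributor matters.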

Code & Models

Repositories

guozhenyuan/phimm
PyTorch · Official

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Privacy, Security, and Data Protection · Cryptography and Data Security