CollabEdit: Towards Non-destructive Collaborative Knowledge Editing

Jiamu Zheng, Jinghuai Zhang, Tianyu Du, Xuhong Zhang, Jianwei Yin, Tao Lin

arXiv:2410.09508 · cs.CL · February 25, 2025

TL;DR

This paper introduces COLLABEDIT, a novel framework for collaborative knowledge editing in large language models that preserves model performance and addresses key challenges like knowledge overlap, conflict, and forgetting.

Contribution

It presents the first non-destructive collaborative knowledge editing framework employing a new model merging mechanism for privacy-preserving, continual knowledge updates in LLMs.

Findings

01. COLLABEDIT outperforms destructive baselines in experiments.
02. Addresses knowledge overlap, conflict, and forgetting effectively.
03. Demonstrates potential for privacy-preserving collaborative LLM updates.

Abstract

Collaborative learning of large language models (LLMs) has emerged as a new paradigm for utilizing private data from different parties to guarantee efficiency and privacy. Meanwhile, Knowledge Editing (KE) for LLMs has also garnered increased attention due to its ability to manipulate the behaviors of LLMs explicitly, yet the collaborative KE case (in which knowledge edits of multiple parties are aggregated in a privacy-preserving and continual manner) remains unexamined. To this end, this manuscript presents the first investigation of collaborative KE, in which we start by carefully identifying its three unique challenges: knowledge overlap, knowledge conflict, and knowledge forgetting. We then propose a non-destructive collaborative KE framework, COLLABEDIT, which employs a novel model merging mechanism to mimic the global KE behavior while preventing the severe…
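
The contrast between naive weight averaging and a merging step that "mimics the global KE behavior" can be illustrated with a small numerical sketch. It is not the authors' implementation. It assumes a MEMIT-style closed-form edit, Delta = R K^T (C + K K^T)^{-1}, which this page does not state, and it assumes each client shares its local update together with the statistic K K^T that Reviewer 02 mentions. All function names, shapes, and the covariance-like matrix C are hypothetical.

```python
import numpy as np


def local_update(R, K, C):
    """Assumed MEMIT-style closed form: Delta = R K^T (C + K K^T)^{-1}."""
    return R @ K.T @ np.linalg.inv(C + K @ K.T)


def global_update(Rs, Ks, C):
    """Reference edit if every client's requests were pooled in one place."""
    R = np.concatenate(Rs, axis=1)
    K = np.concatenate(Ks, axis=1)
    return local_update(R, K, C)


def naive_average(deltas):
    """Plain averaging of client updates (the diluting baseline)."""
    return sum(deltas) / len(deltas)


def merge_with_kkt(deltas, kkts, C):
    """Hypothetical merge that uses each client's shared K K^T.

    Delta_i (C + K_i K_i^T) recovers R_i K_i^T, so summing those terms and
    re-solving against the pooled statistics reproduces the global edit.
    """
    rkt_sum = sum(d @ (C + g) for d, g in zip(deltas, kkts))
    return rkt_sum @ np.linalg.inv(C + sum(kkts))


def demo(seed=0, n_clients=3, d_in=8, d_out=5, edits_per_client=2):
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(d_in, d_in))
    C = A @ A.T + np.eye(d_in)  # symmetric positive definite stand-in
    Ks = [rng.normal(size=(d_in, edits_per_client)) for _ in range(n_clients)]
    Rs = [rng.normal(size=(d_out, edits_per_client)) for _ in range(n_clients)]

    deltas = [local_update(R, K, C) for R, K in zip(Rs, Ks)]
    kkts = [K @ K.T for K in Ks]
    target = global_update(Rs, Ks, C)

    err_avg = np.linalg.norm(naive_average(deltas) - target)
    err_merge = np.linalg.norm(merge_with_kkt(deltas, kkts, C) - target)
    return err_avg, err_merge


if __name__ == "__main__":
    err_avg, err_merge = demo()
    print(f"naive averaging error: {err_avg:.3e}")
    print(f"K K^T merge error:     {err_merge:.3e}")
```

Under these assumptions the merged update matches the pooled edit up to floating-point error, while the averaged update does not. The sketch says nothing about the privacy properties of sharing K K^T, which is the concern Reviewer 02 raises.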

Peer Reviews

Decision · ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

COLLABEDIT allows for non-destructive knowledge editing, which prevents significant performance drops that are common in traditional methods.
The framework is versatile and can integrate existing knowledge editing methods, providing a comprehensive solution to collaborative KE challenges.
Empirical results show that COLLABEDIT outperforms existing destructive baselines, demonstrating superior editing performance even with a large number of edits.

Weaknesses

The non-destructive merging mechanism may introduce additional complexity in implementation compared to simpler, traditional methods. Its scalability in large collaborative environments or with numerous clients may need further exploration. More experiments on different LLMs could benefit the demonstration of the effectiveness of the proposed method.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

+ The paper tackles an important problem of generalizing knowledge editing to collaborative learning settings where privacy is a critical concern.
+ The authors provide a compelling theoretical analysis of the limitations of naive weight sharing and introduce the concept of sharing $KK^{T}$, which is proved to be difficult to attack in the traditional privacy-aware setting.
+ The experiments conducted seem to effectively demonstrate the effectiveness of the proposed method.

Weaknesses

- It is not surprising to see the destructive performance of direct fed-average for knowledge editing, as the edits of individual clients are naturally diluted when models are averaged, although I appreciate the formal mathematical treatment of the issue.
- While knowledge conflict is identified as a key challenge, the paper addresses it in a rather ad hoc manner compared to the other challenges, which are supported by theoretical analysis.
- My biggest concern is on the privacy part of the model. Although

Reviewer 03 · Rating 6 · Confidence 4

Strengths

* The paper identifies and addresses a novel problem of knowledge editing in federated learning for LLMs, a new setting within model editing research.
* The authors propose a straightforward yet effective method, COLLABEDIT, that enables privacy-preserving collaborative editing, which is an essential consideration in multi-party learning scenarios.
* Experiments on GPT-J and GPT2-XL show that COLLABEDIT can substantially improve performance over methods like MEMIT in federated settings, highlighti

Weaknesses

* The need for collaborative knowledge editing within federated LLM may be limited, as large-scale federated LLM scenarios are currently uncommon. This reduces the perceived applicability and impact of the problem being solved.
* The experiments are conducted on older models like GPT-J and GPT2-XL. More recent models such as LLaMA-2, LLaMA-3, or Gemma would provide stronger validation of the proposed method’s efficacy.
* The paper’s structure could benefit from refinement, as some figures and ta

Code & Models

Repositories

lins-lab/collabedit · PyTorch · Official

Taxonomy

Topics: Semantic Web and Ontologies

Methods: Softmax · Attention Is All You Need