Efficient Knowledge Editing via Minimal Precomputation
Akshat Gupta, Maochuan Lu, Thomas Hartvigsen, Gopala Anumanchipalli

TL;DR
This paper shows that the precomputation step in knowledge editing methods like MEMIT can be cut from roughly 44 million hidden vectors per edited layer to less than 0.3% of that number, which sharply reduces setup time and makes model editing faster to start.
Contribution
The authors establish the theoretical minimum precomputation needed and empirically show that substantially fewer hidden vectors are sufficient for effective knowledge editing.
Findings
Precomputation can be reduced to less than 0.3% of the original requirement.
The resulting time savings enable model editing within minutes.
Theoretical analysis supports the sufficiency of minimal precomputation.
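To put the headline figures in perspective, the short calculation below applies the 0.3% bound to the numbers quoted in the abstract. It is only a back-of-the-envelope sketch: it assumes precomputation time scales linearly with the number of tokens processed, which the summary does not state, and the function names are illustrative.

```python
# Figures quoted in the abstract.
ORIGINAL_VECTORS = 44_000_000   # hidden vectors precomputed per edited layer
GPTJ_HOURS = 36                 # reported precomputation time, GPT-J (6B)
LLAMA2_HOURS = 40               # reported precomputation time, Llama2-7B

# 0.3% expressed as parts per thousand, to keep the vector count an exact integer.
REDUCED_PER_MILLE = 3


def reduced_vector_budget(total_vectors: int, per_mille: int = REDUCED_PER_MILLE) -> int:
    """Upper bound on vectors needed if only `per_mille`/1000 of the original is kept."""
    return total_vectors * per_mille // 1000


def scaled_minutes(hours: float, per_mille: int = REDUCED_PER_MILLE) -> float:
    """Estimated time in minutes, ASSUMING cost is linear in the number of vectors."""
    return hours * 60 * per_mille / 1000


print("vector budget:", reduced_vector_budget(ORIGINAL_VECTORS))
print("GPT-J minutes (linear-scaling assumption):", scaled_minutes(GPTJ_HOURS))
print("Llama2-7B minutes (linear-scaling assumption):", scaled_minutes(LLAMA2_HOURS))
```

Under that linear-scaling assumption, the bound works out to at most 132,000 vectors per layer and a precomputation time of under ten minutes for either model, which is consistent with the "within minutes" finding.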
Abstract
Knowledge editing methods like MEMIT are able to make data- and compute-efficient updates of factual knowledge by using a single sentence to update facts and their consequences. However, what is often overlooked is a "precomputation step", which requires a one-time but significant computational cost. The authors of MEMIT originally precompute approximately 44 million hidden vectors per edited layer, which requires a forward pass over 44 million tokens. For GPT-J (6B), this precomputation step takes 36 hours on a single GPU, while it takes approximately 40 hours for Llama2-7B. Additionally, this precomputation time grows with model size. In this paper, we show that this excessive computational cost is unnecessary. Knowledge editing using MEMIT and related methods, such as ROME and EMMET, can be performed by precomputing a very small portion of the 44 million hidden vectors. We first…
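One way to see why a lower bound on precomputation could exist at all is sketched below. It assumes, and this is not stated in the text above, that the precomputed hidden vectors enter the edit only through a d × d second-moment matrix (a sum of outer products) that has to be inverted. Such a matrix built from n vectors has rank at most n, so it cannot be invertible with fewer than d vectors. The dimension, vector counts, and random stand-in data are all hypothetical.

```python
import numpy as np


def second_moment(vectors: np.ndarray) -> np.ndarray:
    """Sum of outer products k k^T over the rows of `vectors` (shape n x d)."""
    return vectors.T @ vectors


rng = np.random.default_rng(0)
d = 16  # hypothetical hidden dimension, kept tiny for illustration

for n in (8, 32):
    # Random stand-ins for hidden vectors; real ones would come from a forward pass.
    keys = rng.standard_normal((n, d))
    C = second_moment(keys)
    rank = int(np.linalg.matrix_rank(C))
    print(f"n={n:>2} vectors -> rank {rank} of {d}, invertible: {rank == d}")
```

With fewer vectors than dimensions the matrix is rank-deficient, while a modest multiple of d is already enough for full rank on this toy data. Whether the paper's bound takes exactly this form cannot be confirmed from the truncated abstract.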
Taxonomy
Topics: Topic Modeling · Scientific Computing and Data Management · Advanced Graph Neural Networks
