Scalable Knowledge Editing for Mixture-of-Experts LLMs via Tensor-Structured Updates

Roman Maksimov; Vladimir Aletov; Dmitry Bylinkin; Daniil Medyakov; Vladimir Solodkin; Aleksandr Beznosikov

arXiv:2605.16686 · cs.LG · May 19, 2026

TL;DR

This paper introduces a tensor-structured knowledge editing (KE) method for Mixture-of-Experts (MoE) LLMs that matches the quality of strong baselines, edits up to 6x faster, and extends KE to sparse architectures.

Contribution

It develops a tensor-based framework for KE in MoE LLMs that uses the Woodbury matrix identity for efficient low-rank updates, matching the quality of strong baselines while being significantly faster.

Findings

01. Matches baseline KE quality on the main metrics

02. Accelerates editing by up to 6x

03. Extends KE to sparse MoE architectures

Abstract

Knowledge editing (KE) provides a lightweight alternative to repeated fine-tuning of LLMs. However, most existing KE methods target dense feed-forward layers, while modern LLMs increasingly adopt Mixture-of-Experts (MoE) architectures for their superior memory footprint and inference efficiency. This mismatch leaves a growing class of production models without principled editing tools. We propose a MEMIT-like framework for knowledge editing in MoE-based LLMs. Our method exploits the tensor structure of MoE layers to formulate the editing objective faithfully at the per-expert level, and applies the Woodbury matrix identity to avoid materializing or inverting the full stacked matrix of expert weights. The resulting update reduces to inversions of fixed low-rank matrices and requires no additional backward passes. Empirically, our approach matches the editing quality of strong baselines…
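To make the role of the Woodbury matrix identity concrete, here is a minimal NumPy sketch. It is not the paper's per-expert formulation, which the abstract does not spell out. It assumes a generic MEMIT-style closed-form update of the shape `R @ K.T @ inv(A + K @ K.T)`, where `A` stands in for a key covariance, `K` holds the keys of `k` edits, and `R` holds the residuals. All names, shapes, and the diagonal choice of `A` are hypothetical. The point is only that, given `A^{-1}`, the `d x d` inverse can be replaced by a `k x k` solve when `k` is much smaller than `d`.

```python
import numpy as np


def woodbury_inverse(A_inv, U, V):
    """Return (A + U @ V)^{-1} given A^{-1}, solving only a k x k system.

    Shapes: A_inv is (d, d), U is (d, k), V is (k, d), with k << d.
    Identity used: (A + U V)^{-1} = A^{-1} - A^{-1} U (I + V A^{-1} U)^{-1} V A^{-1}.
    """
    k = U.shape[1]
    A_inv_U = A_inv @ U                    # (d, k)
    small = np.eye(k) + V @ A_inv_U        # (k, k) system, cheap to solve
    return A_inv - A_inv_U @ np.linalg.solve(small, V @ A_inv)


rng = np.random.default_rng(0)
d, k = 64, 4                               # hidden size and number of edits (hypothetical)

# Assumption: a diagonal stand-in for the key covariance, so A^{-1} is trivial.
a_diag = rng.uniform(1.0, 2.0, size=d)
A = np.diag(a_diag)
A_inv = np.diag(1.0 / a_diag)

K = rng.standard_normal((d, k))            # hypothetical keys, one column per edit
R = rng.standard_normal((8, k))            # hypothetical residuals to be written

# Reference: invert the full d x d matrix directly.
direct = R @ K.T @ np.linalg.inv(A + K @ K.T)

# Low-rank route: only a k x k system is solved.
low_rank = R @ K.T @ woodbury_inverse(A_inv, K, K.T)

max_abs_diff = float(np.max(np.abs(direct - low_rank)))
```

In this toy setting the two routes agree up to floating-point error, while the low-rank route never factorizes a matrix larger than `k x k`. How the paper extends this idea to the stacked tensor of expert weights is not shown here.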
