RippleBench: Capturing Ripple Effects Using Existing Knowledge Repositories
Roy Rinberg, Usha Bhalla, Igor Shilov, Flavio P. Calmon, Rohit Gandikota

TL;DR
RippleBench introduces an automated framework and benchmark for measuring ripple effects in language model interventions, showing that current unlearning methods also reduce accuracy on topics beyond the intended target.
Contribution
The paper presents RippleBench-Maker, an automatic tool for generating ripple effect datasets, and RippleBench-Bio, a new benchmark for evaluating knowledge unlearning in language models.
Findings
All evaluated unlearning methods cause accuracy drops on distant topics.
Ripple effects vary across different unlearning techniques.
The framework enables systematic measurement of ripple effects.
Abstract
Targeted interventions on language models, such as unlearning, debiasing, or model editing, are a central method for refining model behavior and keeping knowledge up to date. While these interventions aim to modify specific information within models (e.g., removing virology content), their effects often propagate to related but unintended areas (e.g., allergies); these side-effects are commonly referred to as the ripple effect. In this work, we present RippleBench-Maker, an automatic tool for generating Q&A datasets that allow for the measurement of ripple effects in any model-editing task. RippleBench-Maker builds on a Wikipedia-based RAG pipeline (WikiRAG) to generate multiple-choice questions at varying semantic distances from the target concept (e.g., the knowledge being unlearned). Using this framework, we construct RippleBench-Bio, a benchmark derived from the WMDP (Weapons of…
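As an illustration of the measurement idea in the abstract, the sketch below tabulates multiple-choice accuracy per semantic-distance bucket for a model before and after an intervention, then reports the per-bucket drop. It is a minimal sketch, not RippleBench-Maker's actual code: the function names, the integer distance buckets, and the toy records are all assumptions made for this example, and the excerpt does not say how the paper defines or bins semantic distance.

```python
from collections import defaultdict


def accuracy_by_distance(records):
    """Return {distance: accuracy} from (distance, is_correct) pairs.

    `distance` is a hypothetical integer bucket, where 0 means the
    question is about the target concept itself.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for distance, is_correct in records:
        totals[distance] += 1
        correct[distance] += int(is_correct)
    return {d: correct[d] / totals[d] for d in sorted(totals)}


def ripple_profile(base_records, edited_records):
    """Accuracy drop (base minus edited) for each shared distance bucket."""
    base = accuracy_by_distance(base_records)
    edited = accuracy_by_distance(edited_records)
    return {d: base[d] - edited[d] for d in base if d in edited}


# Made-up toy results, for illustration only (not data from the paper).
base_results = [(0, True), (0, True), (1, True), (1, False), (2, True), (2, True)]
edited_results = [(0, False), (0, False), (1, True), (1, False), (2, True), (2, False)]

if __name__ == "__main__":
    print(ripple_profile(base_results, edited_results))
```

A profile whose drop stays above zero at larger distance buckets would correspond to the kind of ripple effect the benchmark is designed to expose.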
Taxonomy
Topics: Wikis in Education and Collaboration · Topic Modeling · Misinformation and Its Impacts
