(De)-Indexing and the Right to be Forgotten
Salvatore Vilella, Giancarlo Ruffo

TL;DR
This paper explores the technical challenges of implementing the right to be forgotten in search engines, discussing IR models and the potential role of LLMs in managing personal data removal.
Contribution
It provides an accessible overview of IR concepts and models relevant to de-indexing and the right to be forgotten, highlighting current challenges and future directions.
Findings
IR models are fundamental to de-indexing processes
LLMs can enhance data processing for right-to-be-forgotten (RTBF) requests
Balancing privacy and search efficiency remains complex
Abstract
In the digital age, the challenge of forgetfulness has emerged as a significant concern, particularly regarding the management of personal data and its accessibility online. The right to be forgotten (RTBF) allows individuals to request the removal of outdated or harmful information from public access, yet implementing this right poses substantial technical difficulties for search engines. This paper aims to introduce non-experts to the foundational concepts of information retrieval (IR) and de-indexing, which are critical for understanding how search engines can effectively "forget" certain content. We will explore various IR models, including boolean, probabilistic, vector space, and embedding-based approaches, as well as the role of Large Language Models (LLMs) in enhancing data processing capabilities. By providing this overview, we seek to highlight the complexities involved in…
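To make the idea of de-indexing concrete, the following is a minimal sketch of a boolean retrieval model built on a toy inverted index. It is not taken from the paper. The class `BooleanIndex`, its methods, and the sample documents are hypothetical names chosen for illustration. The sketch assumes that "forgetting" means removing a document identifier from every posting list, so the document stops appearing in results while the source page itself is left untouched.

```python
from collections import defaultdict


class BooleanIndex:
    """Toy inverted index (hypothetical): maps each term to a set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Naive tokenization: lowercase and split on whitespace.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        """Boolean AND: return ids of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result

    def deindex(self, doc_id):
        """Remove doc_id from all posting lists; the document itself is not deleted."""
        for term in list(self.postings):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]


index = BooleanIndex()
index.add("d1", "Jane Doe bankruptcy notice 1998")
index.add("d2", "Jane Doe wins chess tournament")

before = index.search("jane doe")   # both documents match
index.deindex("d1")                 # "forget" d1 at the index level
after = index.search("jane doe")    # only d2 remains retrievable
```

This sketch removes the document from all results, which is only one possible design. The probabilistic, vector space, and embedding-based models named in the abstract store document information differently, so removal in those settings would need its own treatment.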
Taxonomy
Topics: Digital Humanities and Scholarship
