We Should Separate Memorization from Copyright
Adi Haviv, Niva Elkin-Koren, Uri Hacohen, Roi Livni, Shay Moran

TL;DR
This paper argues that memorization in foundation models should be distinguished from copyright infringement, proposing a risk-based evaluation approach aligned with legal standards to improve clarity in technical and legal discussions.
Contribution
It clarifies the distinction between memorization and copying in foundation models and advocates for an output-level, risk-based evaluation method aligned with copyright law.
Findings
Memorization should not be used as a proxy for copyright infringement.
Technical signals that indicate infringement risk should be distinguished from those that merely reflect lawful generalization.
A risk-based evaluation process can better align technical assessments with legal standards.
Abstract
The widespread use of foundation models has introduced a new source of copyright risk. This risk has prompted an active, ongoing debate among data scientists and legal scholars alike, in which claims and results on both sides are often interpreted differently and lead to different implications. Our position is that much of the technical literature relies on traditional reconstruction techniques that are not designed for copyright analysis. As a result, memorization and copying have been conflated across both technical and legal communities and in multiple contexts. We argue that memorization, as commonly studied in data science, should not be equated with copying and should not be used as a proxy for copyright infringement. We distinguish technical signals that meaningfully indicate infringement risk from those that instead reflect…
Taxonomy
Topics: Copyright and Intellectual Property · Law, AI, and Intellectual Property · Intellectual Property and Patents
