Generative AI and the Digital Commons
Saffron Huang, Divya Siddarth

TL;DR
This paper discusses how generative foundation models impact the digital commons, highlighting risks and proposing governance solutions to ensure sustainable and equitable use of publicly available data.
Contribution
It introduces a novel governance framework addressing the unique challenges of GFMs trained on commons-based data, beyond traditional data rights models.
Findings
GFMs may degrade the digital commons
Proposes transparency and shared ownership models
Recommends standards for data contribution and monitoring
Abstract
Many generative foundation models (or GFMs) are trained on publicly available data and use public infrastructure, but 1) may degrade the "digital commons" that they depend on, and 2) do not have processes in place to return value captured to data producers and stakeholders. Existing conceptions of data rights and protection (focusing largely on individually-owned data and associated privacy concerns) and copyright or licensing-based models offer some instructive priors, but are ill-suited for the issues that may arise from models trained on commons-based data. We outline the risks posed by GFMs and why they are relevant to the digital commons, and propose numerous governance-based solutions that include investments in standardized dataset/model disclosure and other kinds of transparency when it comes to generative models' training and capabilities, consortia-based funding for…
Taxonomy
Topics: Scientific Computing and Data Management · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI
