The Implications of Open Generative Models in Human-Centered Data Science Work: A Case Study with Fact-Checking Organizations
Robert Wolfe, Tanushree Mitra

TL;DR
This study explores how fact-checking organizations use open and proprietary generative AI models across their data science workflows, highlighting motivations, limitations, and societal implications.
Contribution
The paper introduces a five-component model of AI use in fact-checking pipelines and analyzes organizational motivations for choosing open versus proprietary models.
Findings
Fact-checkers use open models for autonomy, privacy, and transparency.
Proprietary models are preferred for performance and safety.
Organizations balance open and proprietary models based on specific needs.
Abstract
Calls to use open generative language models in academic research have highlighted the need for reproducibility and transparency in scientific research. However, the impact of generative AI extends well beyond academia, as corporations and public interest organizations have begun integrating these models into their data science pipelines. We expand this lens to include the impact of open models on organizations, focusing specifically on fact-checking organizations, which use AI to observe and analyze large volumes of circulating misinformation, yet must also ensure the reproducibility and impartiality of their work. We wanted to understand where fact-checking organizations use open models in their data science pipelines; what motivates their use of open models or proprietary models; and how their use of open or proprietary models can inform research on the societal impact of generative…
Taxonomy
Topics: Big Data and Business Intelligence · Data Quality and Management · Big Data Technologies and Applications
