GRIN Transfer: A production-ready tool for libraries to retrieve digital copies from Google Books
Liza Daly, Matteo Cargnelutti, Catherine Brobston, John Hess, Greg Leppert, Amanda Watson, Jonathan Zittrain

TL;DR
This paper introduces GRIN Transfer, an open-source Python tool that lets libraries retrieve, structure, and enrich their Google Books collections from the GRIN platform while coping with challenges such as rate-limiting and atomized metadata.
Contribution
The paper presents the initial release of GRIN Transfer, a production-ready pipeline that improves robustness and usability for libraries accessing Google Books data from GRIN.
Findings
GRIN Transfer streamlines collection retrieval from GRIN.
It enhances data structuring and metadata integration.
The tool is adaptable for various library environments.
Abstract
Publicly launched in 2004, the Google Books project has scanned tens of millions of items in partnership with libraries around the world. As part of this project, Google created the Google Return Interface (GRIN). Through this platform, libraries can access their scanned collections, the associated metadata, and the ongoing OCR and metadata improvements that become available as Google reprocesses these collections using new technologies. When downloading the Harvard Library Google Books collection from GRIN to develop the Institutional Books dataset, we encountered several challenges related to rate-limiting and atomized metadata within the GRIN platform. To overcome these challenges and help other libraries make more robust use of their Google Books collections, this technical report introduces the initial release of GRIN Transfer. This open-source and production-ready Python pipeline…
Taxonomy
Topics: Library Collection Development and Digital Resources · Digital Humanities and Scholarship · Library Science and Information Systems
