Running a Data Integration Lab in the Context of the EHRI Project: Challenges, Lessons Learnt and Future Directions
Herminio García-González, Mike Bryant, Suzanne Swartz, Fabio Rovigo, Veerle Vanden Daelen

TL;DR
This paper discusses the experiences, challenges, and lessons learned from running a data integration lab within the EHRI project, aimed at consolidating Holocaust archival sources into a centralized digital platform.
Contribution
It presents a practical model for supporting small institutions in conforming their archival metadata for large-scale data integration efforts.
Findings
Technical challenges in data standardization addressed
Social challenges in engaging small institutions discussed
A replicable framework for data integration proposed
Abstract
Historical study of the Holocaust is commonly hampered by the dispersed and fragmented nature of important archival sources relating to this event. The EHRI project set out to mitigate this problem by building a trans-national network of archives, researchers, and digital practitioners, and one of its main outcomes was the creation of the EHRI Portal, a "virtual observatory" that gathers in one centralised platform descriptions of Holocaust-related archival sources from around the world. In order to build the Portal, a strong data identification and integration effort was required, culminating in the project's third phase with the creation of the EHRI-3 data integration lab. The focus of the lab was to lower the bar to participation in the EHRI Portal by providing support to institutions in conforming their archival metadata with that required for integration, ultimately opening the…
Taxonomy
Topics: Data Quality and Management · Research Data Management Practices · Scientific Computing and Data Management
