ResourceSync: Leveraging Sitemaps for Resource Synchronization
Bernhard Haslhofer, Simeon Warner, Carl Lagoze, Martin Klein, Robert Sanderson, Michael L. Nelson, Herbert van de Sompel

TL;DR
ResourceSync is a proposed general protocol that uses XML Sitemaps for modular synchronization of Web resources. The authors report implementation work for arXiv.org, an experimental prototype for the English Wikipedia, and a client API.
Contribution
The paper introduces a general, modular Web resource synchronization protocol built on XML Sitemaps, as an alternative to the ad-hoc or proprietary solutions currently used for this task.
Findings
Implementation work reported for arXiv.org
Experimental prototype provided for the English Wikipedia
Client API provided for resource synchronization
Abstract
Many applications need up-to-date copies of collections of changing Web resources. Such synchronization is currently achieved using ad-hoc or proprietary solutions. We propose ResourceSync, a general Web resource synchronization protocol that leverages XML Sitemaps. It provides a set of capabilities that can be combined in a modular manner to meet local or community requirements. We report on work to implement this protocol for arXiv.org and also provide an experimental prototype for the English Wikipedia as well as a client API.
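To make the idea of Sitemap-based synchronization concrete, here is a minimal sketch of how a client might compare a published Sitemap against its local copy. It uses only the plain Sitemap format (`urlset`, `url`, `loc`, `lastmod`) and does not model any ResourceSync-specific extensions or capabilities. The function names (`parse_sitemap`, `plan_sync`), the `example.org` URLs, and the timestamps are all hypothetical and are not taken from the paper or its client API.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard Sitemap format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical Sitemap a source might publish (URLs and dates are made up).
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.org/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
  </url>
  <url>
    <loc>http://example.org/res2</loc>
    <lastmod>2013-01-03T18:00:00Z</lastmod>
  </url>
</urlset>"""


def parse_sitemap(xml_text):
    """Return {url: lastmod} for every <url> entry in a Sitemap document."""
    root = ET.fromstring(xml_text)
    entries = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        if loc:
            entries[loc] = lastmod
    return entries


def plan_sync(remote, local):
    """Compare remote and local {url: lastmod} maps.

    Returns (to_fetch, to_delete): URLs that are new or whose lastmod
    differs from the local copy, and URLs no longer listed remotely.
    """
    to_fetch = sorted(u for u, lm in remote.items() if local.get(u) != lm)
    to_delete = sorted(u for u in local if u not in remote)
    return to_fetch, to_delete


# Hypothetical local state: res1 is current, "old" has vanished at the source.
local_state = {
    "http://example.org/res1": "2013-01-02T13:00:00Z",
    "http://example.org/old": "2012-12-01T00:00:00Z",
}

to_fetch, to_delete = plan_sync(parse_sitemap(EXAMPLE_SITEMAP), local_state)
print("fetch:", to_fetch)
print("delete:", to_delete)
```

This sketch compares `lastmod` strings for inequality rather than parsing them as dates, which keeps it simple but treats any change in the timestamp as a reason to re-fetch.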
Taxonomy
Topics: Service-Oriented Architecture and Web Services · Web Data Mining and Analysis · Semantic Web and Ontologies
