A Framework for Evaluation of Composite Memento Temporal Coherence
Scott G. Ainsworth, Michael L. Nelson, Herbert Van de Sompel

TL;DR
This paper presents a framework for evaluating the temporal coherence between an archived HTML page and its embedded resources, using each resource's Memento-Datetime (capture time), Last-Modified datetime, and entity body.
Contribution
It introduces a framework for assessing temporal coherence between a root resource and its embedded resources in web archives, based on three indicators: Memento-Datetime, Last-Modified datetime, and entity body.
Findings
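Embedded resources can have Memento-Datetimes up to several years earlier or later than that of the embedding root resource.
The framework allows coherence to be assessed per embedded resource, rather than inferred from the root resource's capture date alone.
Coherence assessment depends on several indicators together (Memento-Datetime, Last-Modified datetime, and entity body), not on capture time alone.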
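The gap described in the first finding can be measured directly from the `Memento-Datetime` HTTP response headers of the root and embedded mementos, which use the standard HTTP date format. The following is a minimal sketch; the function name `memento_delta` and the header values are hypothetical and are not taken from the paper.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def memento_delta(root_header: str, embedded_header: str) -> timedelta:
    """Return the signed offset of the embedded capture relative to the root.

    A positive result means the embedded resource was captured after the
    root resource; a negative result means it was captured before.
    """
    root_dt = parsedate_to_datetime(root_header)
    embedded_dt = parsedate_to_datetime(embedded_header)
    return embedded_dt - root_dt


# Hypothetical header values, for illustration only.
root_header = "Thu, 15 Jul 2010 12:00:00 GMT"
image_header = "Sat, 20 Jul 2013 12:00:00 GMT"

delta = memento_delta(root_header, image_header)
print(f"Embedded capture is offset by {delta.days} days from the root capture")
```

A signed offset is used rather than an absolute one because the direction matters: an embedded resource captured before the root raises different questions than one captured after it.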
Abstract
Most archived HTML pages embed other web resources, such as images and stylesheets. Playback of the archived web pages typically provides only the capture date (or Memento-Datetime) of the root resource and not the Memento-Datetime of the embedded resources. In the course of our research, we have discovered that the Memento-Datetime of embedded resources can be up to several years in the future or past, relative to the Memento-Datetime of the embedding root resource. We introduce a framework for assessing temporal coherence between a root resource and its embedded resources based on Memento-Datetime, Last-Modified datetime, and entity body.
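To illustrate how the three indicators named in the abstract might be combined, here is a simplified sketch. It is an assumption-laden heuristic, not the classification defined in the paper: the `Capture` type, the function `existed_at_root_time`, and the three-valued result are all hypothetical. It also assumes that Last-Modified values are trustworthy and that an entity body that is identical in two captures did not change and revert in between.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Capture:
    """One archived capture of an embedded resource (hypothetical type)."""
    memento_datetime: datetime
    last_modified: Optional[datetime] = None
    body_digest: Optional[str] = None  # e.g. a hash of the entity body


def existed_at_root_time(
    root_dt: datetime,
    embedded: Capture,
    later: Optional[Capture] = None,
) -> Optional[bool]:
    """Estimate whether the embedded content existed when the root was captured.

    Returns True or False when the available evidence supports a decision,
    and None when it does not.
    """
    if embedded.memento_datetime == root_dt:
        return True

    if embedded.memento_datetime > root_dt:
        # Captured after the root: Last-Modified tells us whether this
        # version already existed at the root's capture time.
        if embedded.last_modified is None:
            return None
        return embedded.last_modified <= root_dt

    # Captured before the root: look for a capture at or after the root
    # with the same entity body, which brackets the root's capture time.
    if (
        later is not None
        and later.memento_datetime >= root_dt
        and embedded.body_digest is not None
        and embedded.body_digest == later.body_digest
    ):
        return True
    return None


def utc(year: int, month: int, day: int) -> datetime:
    """Small helper for building timezone-aware example datetimes."""
    return datetime(year, month, day, tzinfo=timezone.utc)


root_dt = utc(2010, 7, 15)

# Captured three years later, but unmodified since before the root capture.
late_but_old = Capture(utc(2013, 7, 20), last_modified=utc(2009, 1, 1))
# Captured three years later and modified after the root capture.
late_and_new = Capture(utc(2013, 7, 20), last_modified=utc(2012, 3, 1))
# Captured before the root, with a matching body in a later capture.
early = Capture(utc(2010, 1, 1), body_digest="abc")
after = Capture(utc(2011, 1, 1), body_digest="abc")

print(existed_at_root_time(root_dt, late_but_old))
print(existed_at_root_time(root_dt, late_and_new))
print(existed_at_root_time(root_dt, early, after))
```

Returning `None` for the undecidable cases keeps "no evidence" distinct from "evidence of a mismatch", which a two-valued result would conflate.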
Taxonomy
Topics: Web Data Mining and Analysis · Advanced Data Storage Technologies · Scientific Computing and Data Management
