# Tempas: Temporal Archive Search Based on Tags

**Authors:** Helge Holzmann, Avishek Anand

arXiv: 1702.01076 · 2017-02-06

## TL;DR

Tempas is a tag-based temporal search engine for Web archives. It uses external social bookmarking data, such as Delicious, to identify websites that were important and popular at a given time, and indexes their tags and posting times instead of the archived content.

## Contribution

The paper introduces a low-overhead framework that indexes tags and time from external longitudinal resources, enabling temporal search over Web archives without processing the archived content itself.

## Key findings

- Supports temporal selections for search terms by indexing tags together with time.
- Ranks documents by their popularity within the selected time window.
- Recommends queries using tag-tag and tag-document co-occurrence statistics over arbitrary time windows.
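
The three points above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Bookmark` record, the `search` and `recommend` functions, the use of raw post counts as the popularity score, and the toy data with `example.org` URLs are all assumptions made for illustration. The abstract only states that tags and time are indexed and that popularity and co-occurrence statistics are computed per time window.

```python
from collections import Counter
from datetime import date
from typing import NamedTuple


class Bookmark(NamedTuple):
    """Hypothetical record: one URL posted with a set of tags on a given day."""
    url: str
    tags: frozenset
    posted: date


def in_window(b: Bookmark, start: date, end: date) -> bool:
    """True if the bookmark was posted inside the inclusive window."""
    return start <= b.posted <= end


def search(bookmarks, tag, start, end):
    """Rank URLs tagged with `tag` by number of posts in [start, end].

    Post count is an assumed stand-in for popularity.
    """
    counts = Counter(
        b.url for b in bookmarks if tag in b.tags and in_window(b, start, end)
    )
    return counts.most_common()


def recommend(bookmarks, tag, start, end, k=3):
    """Suggest up to k tags that co-occur most often with `tag` in [start, end]."""
    co_occurrence = Counter()
    for b in bookmarks:
        if tag in b.tags and in_window(b, start, end):
            co_occurrence.update(b.tags - {tag})
    return [t for t, _ in co_occurrence.most_common(k)]


# Toy data, entirely fictional.
posts = [
    Bookmark("http://a.example.org", frozenset({"search", "web"}), date(2005, 3, 1)),
    Bookmark("http://a.example.org", frozenset({"search"}), date(2005, 6, 1)),
    Bookmark("http://b.example.org", frozenset({"search", "news"}), date(2005, 7, 1)),
    Bookmark("http://b.example.org", frozenset({"search", "news"}), date(2009, 1, 1)),
    Bookmark("http://b.example.org", frozenset({"search", "news", "web"}), date(2009, 2, 1)),
]

results_2005 = search(posts, "search", date(2005, 1, 1), date(2005, 12, 31))
results_2009 = search(posts, "search", date(2009, 1, 1), date(2009, 12, 31))
related_2009 = recommend(posts, "search", date(2009, 1, 1), date(2009, 12, 31))
```

The same tag yields different rankings and different suggested tags depending on the window, which is the behavior the list describes. A linear scan keeps the sketch short; a real system would presumably use an inverted index keyed by tag and time.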

## Abstract

Limited search and access patterns over Web archives have been well documented. One of the key reasons is the lack of understanding of the user access patterns over such collections, which in turn is attributed to the lack of effective search interfaces. Current search interfaces for Web archives are (a) either purely navigational or (b) have sub-optimal search experience due to ineffective retrieval models or query modeling. We identify that external longitudinal resources, such as social bookmarking data, are crucial sources to identify important and popular websites in the past. To this extent we present Tempas, a tag-based temporal search engine for Web archives.

Websites are posted at specific times of interest on several external platforms, such as bookmarking sites like Delicious. Attached tags not only act as relevant descriptors useful for retrieval, but also encode the time of relevance. With Tempas we tackle the challenge of temporally searching a Web archive by indexing tags and time. We allow temporal selections for search terms, rank documents based on their popularity and also provide meaningful query recommendations by exploiting tag-tag and tag-document co-occurrence statistics in arbitrary time windows. Finally, Tempas operates as a fairly non-invasive indexing framework. By not dealing with contents from the actual Web archive it constitutes an attractive and low-overhead approach for quick access into Web archives.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.01076/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1702.01076/full.md

## References

9 references — full list in the complete paper: https://tomesphere.com/paper/1702.01076/full.md

---
Source: https://tomesphere.com/paper/1702.01076