LiveRank: How to Refresh Old Datasets
The Dang Huynh (LINCS), Fabien Mathieu (LINCS), Laurent Viennot (GANG, LINCS)

TL;DR
This paper introduces LiveRank, a method for efficiently identifying active nodes in old datasets by ranking nodes to minimize queries, with applications to web pages and social networks.
Contribution
It proposes a new ranking approach called LiveRank that improves the efficiency of detecting active nodes in outdated datasets, leveraging PageRank for better performance.
Findings
Building on PageRank yields efficient LiveRanks.
LiveRank reduces the number of queries needed to find active nodes.
The approach works for both web graphs and social networks.
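As a rough illustration of the first finding, the sketch below computes plain PageRank on an old snapshot and sorts nodes by decreasing score to obtain a static ordering. It is only an assumption-laden example: this summary does not specify the paper's exact LiveRank constructions, and the function names `pagerank` and `static_liverank`, the damping factor, and the iteration count are hypothetical choices made here.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    graph: dict mapping each node to a list of its successors in the
    old snapshot (an assumed input format, not taken from the paper).
    """
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v, successors in graph.items():
            if successors:
                share = damping * rank[v] / len(successors)
                for w in successors:
                    new_rank[w] += share
        rank = new_rank
    return rank


def static_liverank(graph):
    """Order nodes by decreasing PageRank (ties broken by node name)."""
    scores = pagerank(graph)
    return sorted(scores, key=lambda v: (-scores[v], str(v)))
```

For example, `static_liverank({"a": ["c"], "b": ["c"], "c": ["a"], "d": []})` places `"c"` first, since it receives links from both `"a"` and `"b"`. Such an ordering is computed once, before any liveness query is made.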
Abstract
This paper considers the problem of refreshing a dataset. More precisely, given a collection of nodes gathered at some time (Web pages, users from an online social network) along with some structure (hyperlinks, social relationships), we want to identify a significant fraction of the nodes that still exist at present time. The liveness of an old node can be tested through an online query at present time. We call LiveRank a ranking of the old pages so that active nodes are more likely to appear first. The quality of a LiveRank is measured by the number of queries necessary to identify a given fraction of the active nodes when using the LiveRank order. We study different scenarios from a static setting where the LiveRank is computed before any query is made, to dynamic settings where the LiveRank can be updated as queries are processed. Our results show that building on the PageRank can…
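The quality measure described in the abstract can be sketched as follows: walk through the nodes in LiveRank order, issue one liveness query per node, and count the queries made by the time a given fraction of the active nodes has been found. This is a minimal sketch for offline evaluation only; the function name `queries_needed` and the `is_alive` callback are hypothetical, and the code assumes the full set of active nodes is known in advance as ground truth.

```python
import math


def queries_needed(ranking, is_alive, fraction):
    """Number of queries, in ranking order, to find `fraction` of the active nodes.

    ranking:  list of nodes, most promising first.
    is_alive: callable node -> bool, standing in for the online liveness query.
    fraction: target share of active nodes, between 0 and 1.
    """
    # Ground truth is needed to know when the target is reached.
    total_alive = sum(1 for node in ranking if is_alive(node))
    target = math.ceil(fraction * total_alive)
    if target == 0:
        return 0
    found = 0
    for queries, node in enumerate(ranking, start=1):
        if is_alive(node):
            found += 1
            if found >= target:
                return queries
    return len(ranking)
```

With active nodes `{"a", "c", "e", "f"}`, the order `["a", "b", "c", "d", "e", "f"]` needs 3 queries to find half of them, whereas `["a", "c", "e", "f", "b", "d"]` needs only 2. A lower count for the same fraction indicates a better ordering.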
