Online Algorithms for Estimating Change Rates of Web Pages
Konstantin Avrachenkov, Kishor Patil, Gugan Thoppe

TL;DR
This paper introduces three efficient online algorithms for estimating web page change rates, which search engines need in order to keep their caches fresh under bandwidth constraints. The algorithms come with convergence proofs and outperform MLE-based estimation.
Contribution
The paper presents three novel low-complexity online estimation schemes for page change rates. Its analysis includes the first convergence proof for a stochastic heavy-ball method that does not assume bounded gradients or bounded noise.
Findings
The proposed estimators outperform MLE in accuracy and speed.
All schemes converge asymptotically, and their convergence rates are derived.
Algorithms are applicable to database synchronization and network inventory management.
Abstract
A search engine maintains local copies of different web pages to provide quick search results. This local cache is kept up-to-date by a web crawler that frequently visits these different pages to track changes in them. Ideally, the local copy should be updated as soon as a page changes on the web. However, finite bandwidth availability and server restrictions limit how frequently different pages can be crawled. This brings forth the following optimization problem: maximize the freshness of the local cache subject to the crawling frequencies being within prescribed bounds. While tractable algorithms do exist to solve this problem, these either assume the knowledge of exact page change rates or use inefficient methods such as MLE for estimating the same. We address this issue here. We provide three novel schemes for online estimation of page change rates, all of which have extremely low…
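To make the estimation problem concrete, here is a minimal sketch of one way a change rate can be estimated online from nothing more than "changed / not changed" observations at each crawl. It is an illustration and not necessarily one of the paper's three schemes. It assumes (neither assumption is stated in the summary above) that page changes follow a Poisson process with rate `change_rate` and that the gaps between crawls are exponentially distributed with a known rate `crawl_rate`. Under those assumptions the probability of seeing a change at a crawl is change_rate / (change_rate + crawl_rate), so the observed fraction of changed crawls can be inverted to estimate the change rate. The class and function names are hypothetical.

```python
import random


class OnlineChangeRateEstimator:
    """Running change-rate estimate from binary change indicators.

    Assumes Poisson page changes and exponential inter-crawl times
    with known rate `crawl_rate` (illustrative assumptions).
    """

    def __init__(self, crawl_rate):
        self.crawl_rate = crawl_rate
        self.crawls = 0    # number of crawls seen so far
        self.changed = 0   # crawls at which the page had changed

    def update(self, page_changed):
        """Fold in one crawl result and return the current estimate."""
        self.crawls += 1
        self.changed += int(page_changed)
        unchanged = self.crawls - self.changed
        # Invert P(changed) = rate / (rate + crawl_rate).
        # The +1 keeps the denominator positive when every crawl saw a change.
        return self.crawl_rate * self.changed / (unchanged + 1)


def simulate_indicators(change_rate, crawl_rate, n, seed=0):
    """Generate n synthetic change indicators under the assumed model."""
    rng = random.Random(seed)
    flags = []
    for _ in range(n):
        gap = rng.expovariate(crawl_rate)            # time since last crawl
        first_change = rng.expovariate(change_rate)  # time to next change
        flags.append(first_change < gap)             # changed within the gap?
    return flags


def run_demo(change_rate=2.0, crawl_rate=1.0, n=200_000):
    """Feed simulated crawls to the estimator and return the final estimate."""
    estimator = OnlineChangeRateEstimator(crawl_rate)
    estimate = 0.0
    for flag in simulate_indicators(change_rate, crawl_rate, n):
        estimate = estimator.update(flag)
    return estimate
```

Each update costs a constant number of arithmetic operations and stores only two counters, which is the kind of per-iteration cost the abstract has in mind when it contrasts online schemes with MLE.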
