Improving Website Hyperlink Structure Using Server Logs
Ashwin Paranjape, Robert West, Leila Zia, Jure Leskovec

TL;DR
This paper presents a data-driven approach using server logs to automatically identify and suggest useful new hyperlinks for websites, improving navigation and reducing manual editing efforts.
Contribution
It introduces a novel method leveraging server logs to predict beneficial nonexistent links and formulates an efficient link placement algorithm under budget constraints.
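As a rough illustration of the idea (not the paper's actual model, which this summary does not spell out), the sketch below mines navigation traces for pairs of pages that users reach indirectly and then keeps the top candidates within a link budget. The session format, the function names `indirect_path_counts` and `pick_links`, and the plain top-K selection rule are all assumptions made for this example; in particular, it ignores any interaction between several links placed on the same page.

```python
from collections import Counter
from itertools import combinations


def indirect_path_counts(sessions, existing_links):
    """Count sessions in which a user reached t some time after s
    through at least one intermediate page, with no direct link s -> t."""
    counts = Counter()
    for path in sessions:
        seen = set()  # count each (s, t) pair at most once per session
        for i, j in combinations(range(len(path)), 2):
            s, t = path[i], path[j]
            if j - i < 2 or s == t:
                continue  # adjacent pages were reached by a direct click
            if (s, t) in existing_links or (s, t) in seen:
                continue
            seen.add((s, t))
            counts[(s, t)] += 1
    return counts


def pick_links(counts, budget):
    """Hypothetical selection rule: keep the `budget` highest-count candidates."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pair for pair, _ in ranked[:budget]]


# Toy data (made up for illustration): each session is an ordered list of pages.
sessions = [
    ["A", "B", "C"],
    ["A", "D", "C"],
    ["A", "B", "D"],
    ["D", "C", "E"],
]
existing_links = {
    ("A", "B"), ("B", "C"), ("A", "D"),
    ("D", "C"), ("B", "D"), ("C", "E"),
}

counts = indirect_path_counts(sessions, existing_links)
suggested = pick_links(counts, budget=1)
print(suggested)
```

On this toy data, two sessions travel from A to C through an intermediate page, so the candidate link A -> C ranks first. A real pipeline would first need to reconstruct sessions from raw server log entries, which this sketch takes as given.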
Findings
On Wikipedia, the suggested links match new links that editors actually added, as recorded in the revision history.
The approach applies to other websites that keep standard server logs, such as Simtk.
It improves website navigation without manual link curation.
Abstract
Good websites should be easy to navigate via hyperlinks, yet maintaining a high-quality link structure is difficult. Identifying pairs of pages that should be linked may be hard for human editors, especially if the site is large and changes frequently. Further, given a set of useful link candidates, the task of incorporating them into the site can be expensive, since it typically involves humans editing pages. In the light of these challenges, it is desirable to develop data-driven methods for automating the link placement task. Here we develop an approach for automatically finding useful hyperlinks to add to a website. We show that passively collected server logs, beyond telling us which existing links are useful, also contain implicit signals indicating which nonexistent links would be useful if they were to be introduced. We leverage these signals to model the future usefulness of…
