Characterising Web Site Link Structure
Shi Zhou, Ingemar Cox, Vaclav Petricek

TL;DR
This study investigates the topological link structures within individual web sites, revealing a common third-order property across diverse sites that distinguishes them from other network types and models.
Contribution
It identifies a universal third-order topological property in web sites, contrasting with other networks and generative models, and provides a detailed statistical analysis of site structures.
Findings
The third-order property is consistent across diverse web sites.
Web sites differ significantly in first- and second-order properties.
The third-order property is unique to web sites and not found in other networks.
Abstract
The topological structures of the Internet and the Web have received considerable attention. However, there has been little research on the topological properties of individual web sites. In this paper, we consider whether web sites (as opposed to the entire Web) exhibit structural similarities. To do so, we exhaustively crawled 18 web sites as diverse as governmental departments, commercial companies and university departments in different countries. These web sites ranged in size from a few thousand pages to millions of pages. Statistical analysis of these 18 sites revealed that the internal link structures of the web sites are significantly different when measured with first- and second-order topological properties, i.e. properties based on the connectivity of an individual node or a pair of nodes. However, examination of a third-order topological property that considers the…
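To make the distinction between first- and second-order properties concrete, here is a minimal sketch in Python. It assumes one common reading of those terms: the degree distribution as a first-order property (connectivity of individual nodes) and the mean neighbour degree as a second-order property (connectivity of pairs of linked nodes). The function names, the toy edge list, and the choice to treat links as undirected are illustrative assumptions, not the authors' measurement code. The third-order property is not sketched because its definition is cut off in the abstract above.

```python
from collections import Counter


def degrees(edges):
    """Count links per node, treating each link as undirected (an assumption)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def degree_distribution(edges):
    """First-order property: fraction of nodes that have each degree k."""
    deg = degrees(edges)
    n = len(deg)
    counts = Counter(deg.values())
    return {k: c / n for k, c in sorted(counts.items())}


def mean_neighbour_degree(edges):
    """Second-order property: for each degree k, the average degree of the
    nodes found at the other end of links attached to degree-k nodes."""
    deg = degrees(edges)
    total = Counter()  # sum of neighbour degrees, keyed by node degree
    ends = Counter()   # number of link ends, keyed by node degree
    for u, v in edges:
        total[deg[u]] += deg[v]
        ends[deg[u]] += 1
        total[deg[v]] += deg[u]
        ends[deg[v]] += 1
    return {k: total[k] / ends[k] for k in sorted(ends)}


# Hypothetical toy site: a home page linking to three pages, one of which
# links to a further page.
toy_edges = [("home", "a"), ("home", "b"), ("home", "c"), ("c", "d")]

if __name__ == "__main__":
    print(degree_distribution(toy_edges))
    print(mean_neighbour_degree(toy_edges))
```

Both functions take a plain list of (source, target) page pairs, so the output of a crawl could be passed in directly once duplicate links and self-links are removed.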
