Web Routineness and Limits of Predictability: Investigating Demographic and Behavioral Differences Using Web Tracking Data
Juhi Kulshrestha, Marcos Oliveira, Orkut Karacalik, Denis Bonnay, Claudia Wagner

TL;DR
This study shows that routines in web browsing raise its achievable predictability: individual browsing behavior is predictable to 85% on average, and demographic factors partly explain the variability across users.
Contribution
We introduce an information-theoretic framework to quantify the limits of web browsing predictability and analyze demographic influences using real tracking data.
Findings
Web routines make browsing behavior predictable to 85% on average.
Predictability varies across individuals based on demographics.
Web behavior patterns can be systematically measured and analyzed.
Abstract
Understanding human activities and movements on the Web is not only important for computational social scientists but can also offer valuable guidance for the design of online systems for recommendations, caching, advertising, and personalization. In this work, we demonstrate that people tend to follow routines on the Web, and these repetitive patterns of web visits increase their browsing behavior's achievable predictability. We present an information-theoretic framework for measuring the uncertainty and theoretical limits of predictability of human mobility on the Web. We systematically assess the impact of different design decisions on the measurement. We apply the framework to a web tracking dataset of German internet users. Our empirical results highlight that individuals' routines on the Web make their browsing behavior predictable to 85% on average, though the value varies across…
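To make the idea of a "theoretical limit of predictability" concrete, here is a minimal sketch of one common way such limits are computed in the human mobility literature: estimate the entropy of a visit sequence, then solve Fano's inequality for the highest accuracy any predictor could reach. This is an illustration under stated assumptions, not the authors' exact pipeline. The function names (`shannon_entropy`, `fano_bound`) and the toy visit list are hypothetical, and the sketch uses the order-free Shannon entropy for simplicity. An entropy-rate estimator that accounts for the order of visits would give an entropy no higher than this one, and therefore a predictability bound at least as high, which is where routines would show up.

```python
import math
from collections import Counter


def shannon_entropy(visits):
    """Entropy in bits of the empirical distribution of visited sites.

    Ignores the order of visits (an assumption made for simplicity).
    """
    counts = Counter(visits)
    total = len(visits)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def fano_bound(entropy, n_states, tol=1e-9):
    """Upper bound on predictability implied by Fano's inequality.

    Solves S = H(p) + (1 - p) * log2(N - 1) for p in [1/N, 1] by bisection,
    where H(p) is the binary entropy function and N is the number of
    distinct states (here: distinct sites).
    """
    if n_states < 2:
        return 1.0  # a single possible state is trivially predictable

    def fano(p):
        h = 0.0
        if 0.0 < p < 1.0:
            h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        return h + (1 - p) * math.log2(n_states - 1)

    # fano(p) decreases from log2(N) at p = 1/N to 0 at p = 1,
    # so clamp the entropy into that range before bisecting.
    entropy = min(max(entropy, 0.0), math.log2(n_states))
    lo, hi = 1.0 / n_states, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano(mid) > entropy:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # Hypothetical toy sequence of visited site categories.
    visits = ["mail", "news", "mail", "search", "mail", "news", "mail"]
    s = shannon_entropy(visits)
    bound = fano_bound(s, len(set(visits)))
    print(f"entropy = {s:.3f} bits, predictability bound = {bound:.3f}")
```

The two extremes are a useful sanity check: zero entropy gives a bound of 1 (a fully predictable user), and the maximum entropy log2(N) gives a bound of 1/N (no better than uniform guessing).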
Taxonomy
Topics: Human Mobility and Location-Based Analysis · Complex Network Analysis Techniques · Data-Driven Disease Surveillance
