Reconstructing Detailed Browsing Activities from Browser History
Geza Kovacs

TL;DR
This paper presents a machine learning approach to accurately reconstruct detailed user browsing activities, including time spent and focused tabs, solely from incomplete browser history data, enabling insights without extensive monitoring tools.
Contribution
The authors develop a novel machine learning method that predicts user focus and activity at second-level granularity from browser history, achieving high accuracy and enabling detailed activity reconstruction.
Findings
F1-score of 0.84 for focus prediction
76.2% accuracy in domain identification
R^2 of 0.96 for total online time reconstruction
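For readers unfamiliar with the two summary metrics above, the following is a minimal sketch of how an F1-score and an R^2 value are conventionally computed. The function names and the 0/1 label encoding are assumptions made for illustration; the paper's own evaluation code is not shown in this summary.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels (assumed encoding: 1 = focused, 0 = not focused)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot
```

Here `f1_score` would be applied to per-second focus labels and `r_squared` to per-user totals of online time, though that pairing is inferred from the findings list rather than stated in it.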
Abstract
Users' detailed browsing activity - such as what sites they are spending time on and for how long, and what tabs they have open and which one is focused at any given time - is useful for a number of research and practical applications. Gathering such data, however, requires that users install and use a monitoring tool over long periods of time. In contrast, browser extensions can gain instantaneous access to months of browser history data. However, the browser history is incomplete: it records only navigation events, missing important information such as time spent or tab focused. In this work, we aim to reconstruct time spent on sites with only users' browsing histories. We gathered three months of browsing history and two weeks of ground-truth detailed browsing activity from 185 participants. We developed a machine learning algorithm that predicts whether the browser window is focused…
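To make the problem concrete, here is a minimal sketch of a naive heuristic baseline, not the paper's machine learning algorithm. It assumes the history is a list of (timestamp, URL) navigation events and attributes the gap between consecutive events to the earlier page, capped at an idle cutoff. The function name, the input format, and the 300-second cap are all assumptions chosen for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse

IDLE_CAP_SECONDS = 300  # assumed cutoff; not a value taken from the paper


def estimate_time_per_domain(visits, idle_cap=IDLE_CAP_SECONDS):
    """Estimate seconds per domain from navigation events alone.

    visits: list of (unix_timestamp_seconds, url) tuples, in any order.
    """
    ordered = sorted(visits)  # sort by timestamp
    totals = defaultdict(float)
    # Pair each visit with the next one; the final visit has no successor
    # and therefore contributes no time under this heuristic.
    for (start, url), (next_start, _) in zip(ordered, ordered[1:]):
        domain = urlparse(url).netloc
        # Long gaps are treated as idle time and capped.
        totals[domain] += min(next_start - start, idle_cap)
    return dict(totals)
```

A heuristic like this cannot tell whether the window was focused during a gap or which of several open tabs was active, which is the missing information the abstract describes.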
Taxonomy
Topics: Personal Information Management and User Behavior · Recommender Systems and Techniques · Web Data Mining and Analysis
