What's in a Session: Tracking Individual Behavior on the Web
Mark Meiss, John Duncan, Bruno Gonçalves, José J. Ramasco, Filippo Menczer

TL;DR
This study analyzes all HTTP requests generated by a thousand undergraduates over two months, identifies properties of Web traffic and individual browsing behavior, and shows that click streams cannot be cleanly segmented into sessions using timeouts. It proposes a logical, referrer-based definition of sessions instead.
Contribution
It introduces a new logical definition of web sessions based on referrer URLs and demonstrates its advantages over timeout-based methods.
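The idea of grouping clicks by referrer rather than by elapsed time can be illustrated with a minimal sketch. This is not the authors' implementation: the `Request` record, the `logical_sessions` function, the sample URLs, and the rule that a missing or previously unseen referrer opens a new session are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Request(NamedTuple):
    """Hypothetical click record; field names are assumptions."""
    timestamp: float          # seconds since start of observation
    url: str                  # requested URL
    referrer: Optional[str]   # referrer URL, or None if absent


def logical_sessions(requests: List[Request]) -> List[List[Request]]:
    """Group requests into sessions by following referrer links.

    Assumed rule: a request joins the session that most recently
    visited its referrer URL; a request with no referrer (or an
    unseen one) starts a new session. Time gaps are ignored.
    """
    sessions: List[List[Request]] = []
    session_of_url = {}  # url -> index of the session that last visited it
    for req in sorted(requests, key=lambda r: r.timestamp):
        if req.referrer is not None and req.referrer in session_of_url:
            idx = session_of_url[req.referrer]
        else:
            sessions.append([])
            idx = len(sessions) - 1
        sessions[idx].append(req)
        session_of_url[req.url] = idx
    return sessions


# Illustrative (made-up) click stream with two interleaved trails.
clicks = [
    Request(0, "a.example/", None),
    Request(5, "b.example/", None),
    Request(10, "a.example/news", "a.example/"),
    Request(12, "b.example/mail", "b.example/"),
    Request(4000, "a.example/story", "a.example/news"),
]
sessions = logical_sessions(clicks)
for i, s in enumerate(sessions):
    print(i, [r.url for r in s])
```

In this sketch the last click still belongs to the first trail despite the long pause before it, because membership is decided by the referrer chain alone.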
Findings
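Web site popularity, measured as the number of users who contribute to a site's traffic, lacks any intrinsic mean and may be unbounded.
Many aspects of individual browsing behavior can be approximated by log-normal distributions, even though aggregate behavior is scale-free.
Click streams cannot be cleanly segmented into sessions using timeouts, which affects any navigation model built on per-session statistics.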
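To see why timeout-based segmentation is sensitive to the chosen threshold, consider a minimal sketch. The function name `timeout_sessions`, the timestamps, and the three timeout values are all hypothetical and are not taken from the paper's data.

```python
from typing import List


def timeout_sessions(timestamps: List[float], timeout: float) -> List[List[float]]:
    """Split a click stream wherever the gap between consecutive
    requests exceeds `timeout` seconds (a conventional heuristic)."""
    sessions: List[List[float]] = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= timeout:
            sessions[-1].append(t)   # gap is small: same session
        else:
            sessions.append([t])     # gap is large: new session
    return sessions


# Made-up request times (seconds) for a single user.
stamps = [0, 30, 400, 2000, 2100, 9000]

# Count the sessions produced by each hypothetical timeout.
counts = {t: len(timeout_sessions(stamps, t)) for t in (60, 600, 1800)}
print(counts)
```

The same click stream yields a different number of sessions for each threshold, so any per-session statistic (length, duration, count) depends on a parameter that the data alone do not determine.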
Abstract
We examine the properties of all HTTP requests generated by a thousand undergraduates over a span of two months. Preserving user identity in the data set allows us to discover novel properties of Web traffic that directly affect models of hypertext navigation. We find that the popularity of Web sites -- the number of users who contribute to their traffic -- lacks any intrinsic mean and may be unbounded. Further, many aspects of the browsing behavior of individual users can be approximated by log-normal distributions even though their aggregate behavior is scale-free. Finally, we show that users' click streams cannot be cleanly segmented into sessions using timeouts, affecting any attempt to model hypertext navigation using statistics of individual sessions. We propose a strictly logical definition of sessions based on browsing activity as revealed by referrer URLs; a user may have…
Taxonomy
Topics: Complex Network Analysis Techniques · Web Data Mining and Analysis · Peer-to-Peer Network Technologies
