Macroscopic and microscopic statistical properties observed in blog entries
Yukie Sano, Misako Takayasu

TL;DR
This paper analyzes the statistical properties of blog entries. After a normalization step removes external factors, the appearance of low-frequency words follows an independent Poisson process, and Zipf's law for word frequency holds regardless of a blogger's activity type.
Contribution
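It introduces a normalization preprocess that removes external factors (spam blogs, server breakdowns, growth in the blogger population, and weekly periodicity) from word-frequency counts, identifies two distinct types of blogger behavior, and confirms that Zipf's law holds independently of those types.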
Findings
Low-frequency words follow an independent Poisson process once the counts are normalized.
Bloggers exhibit two distinct types of behavior.
Zipf's law for word frequency holds independently of a blogger's activity type.
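As an illustration of the last finding, Zipf's law states that the frequency of the word of rank r falls off roughly as r^(-alpha), with alpha close to 1. The sketch below estimates alpha by a least-squares fit in log-log space. It is a minimal example on a synthetic corpus, not the authors' procedure: the function name `zipf_exponent` and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter

def zipf_exponent(tokens):
    """Estimate the Zipf exponent by fitting log(frequency) against log(rank).

    Hypothetical helper for illustration; the paper's own fitting
    method is not described in this summary.
    """
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative; the exponent is its magnitude.
    return -sxy / sxx

# Synthetic corpus: the word of rank r appears about 1000 / r times.
tokens = []
for r in range(1, 51):
    tokens.extend([f"w{r}"] * round(1000 / r))

alpha = zipf_exponent(tokens)
print(f"estimated Zipf exponent: {alpha:.3f}")
```

On real blog text the fit is usually restricted to a range of ranks, since the highest and lowest ranks tend to deviate from a pure power law.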
Abstract
We observe the statistical properties of blogs, which are expected to reflect social human interaction. First, we introduce a basic normalization preprocess that enables us to evaluate the genuine word frequency in blogs, independent of external factors such as spam blogs, server breakdowns, growth in the population of bloggers, and periodic weekly behavior. After this process, we confirm that low-frequency words clearly follow an independent Poisson process, as theoretically expected. Second, we focus on each blogger's basic behavior and find that there are two kinds of blogger behavior. Further, Zipf's law for word frequency is confirmed to hold universally, independent of individual activity type.
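The effect of normalization on the Poisson check can be sketched as follows. For a Poisson process the variance of the counts equals their mean, so the ratio variance/mean (the Fano factor) should be close to 1. The example simulates a word whose daily count is Poisson with a rate proportional to the total number of entries, lets the total grow over time, and compares the ratio before and after rescaling. The rescaling used here (dividing by the daily total and multiplying by the average total) is an assumption for illustration; the summary does not give the paper's exact normalization, which also handles spam, outages, and weekly periodicity. All names and parameter values are hypothetical.

```python
import math
import random
from statistics import mean, pvariance

def poisson_sample(lam, rng):
    """Draw one Poisson variate with mean lam (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def normalize(word_counts, total_entries):
    """Rescale daily counts to a constant blog volume (assumed scheme)."""
    base = mean(total_entries)
    return [c * base / t for c, t in zip(word_counts, total_entries)]

def fano_factor(values):
    """Variance-to-mean ratio; close to 1 for Poisson-distributed counts."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; ratio undefined")
    return pvariance(values) / m

rng = random.Random(0)
days = 2000
# Total entries per day grow linearly from 1000 to 3000 (growing blogger population).
total_entries = [1000 + 2000 * d / (days - 1) for d in range(days)]
# The word appears in a fixed small fraction of entries.
word_counts = [poisson_sample(0.005 * t, rng) for t in total_entries]

fano_raw = fano_factor(word_counts)
fano_norm = fano_factor(normalize(word_counts, total_entries))
print(f"raw: {fano_raw:.2f}  normalized: {fano_norm:.2f}")
```

In this toy setting the raw ratio is inflated by the growth trend, while the normalized ratio moves back toward 1. That is the qualitative behavior the abstract describes for low-frequency words.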
Taxonomy
Topics: Complex Network Analysis Techniques · Opinion Dynamics and Social Influence · Complex Systems and Time Series Analysis
