Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Márton Mestyán, Taha Yasseri, János Kertész

TL;DR
This paper presents a model that predicts a movie's box office success early by analyzing Wikipedia activity data, demonstrating the potential of social big data for real-time societal trend prediction.
Contribution
It introduces a minimalistic predictive model linking Wikipedia activity to movie success, enabling early forecasts before release.
Findings
Wikipedia activity correlates with movie success
Predictions can be made before movie release
Social data provides early indicators of popularity
Abstract
Use of socially generated "big data" to access information about collective states of mind in human societies has become a new paradigm in the emerging field of computational social science. A natural application of this would be the prediction of society's reaction to a new product in the sense of popularity and adoption rate. However, bridging the gap between "real-time monitoring" and "early predicting" remains a big challenge. Here we report on an endeavor to build a minimalistic predictive model for the financial success of movies based on collective activity data of online users. We show that the popularity of a movie can be predicted well before its release by measuring and analyzing the activity level of editors and viewers of the movie's corresponding entry in Wikipedia, the well-known online encyclopedia.
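The abstract does not spell out the form of the model, so the following is only a minimal sketch of what a "minimalistic predictive model" of this kind could look like. It assumes a single predictor (pre-release page views of the movie's Wikipedia entry) and an ordinary least-squares fit in log-log space. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_revenue(page_views, slope, intercept):
    """Map pre-release page views to predicted revenue via the log-log fit."""
    return 10 ** (slope * math.log10(page_views) + intercept)

# Made-up toy data (NOT from the paper): pre-release page views and revenue.
toy_views = [1_000, 10_000, 100_000, 1_000_000]
toy_revenue = [1e5, 1e6, 1e7, 1e8]

# Assumption: fit on log10-transformed values, since both quantities
# span several orders of magnitude.
slope, intercept = fit_line(
    [math.log10(v) for v in toy_views],
    [math.log10(r) for r in toy_revenue],
)

# Predict revenue for a hypothetical unreleased movie with 50,000 page views.
estimate = predict_revenue(50_000, slope, intercept)
```

In practice one would fit on movies that have already been released, using only activity recorded before each release date, and then apply the fitted line to upcoming titles. Further activity measures, such as the number of edits or of distinct editors, could be added as extra predictors in a multivariate fit.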
