Presenting a Larger Up-to-date Movie Dataset and Investigating the Effects of Pre-released Attributes on Gross Revenue

Arnab Sen Sharma; Tirtha Roy; Sadique Ahmmod Rifat; Maruf Ahmed Mridul

arXiv:2110.07039 · cs.IR · December 8, 2021

TL;DR

This paper introduces a large, updated movie dataset and analyzes how pre-release attributes influence box office revenue, using statistical methods and machine learning to predict earnings and classify movies.

Contribution

The work provides a comprehensive, up-to-date dataset and a novel star power metric, enabling improved revenue prediction models for movies.

Findings

01

Star cast and director positively impact revenue.

02

A set of key attributes can predict revenue with 60% accuracy.

03

Publicly available datasets and tools facilitate further research.
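To make the classification finding concrete, here is a minimal sketch of how gross revenue might be binned into classes and how a classifier's accuracy on those classes could be scored. The class boundaries, function names, and sample labels are all hypothetical and chosen only for illustration; this page does not state the paper's actual classes or model.

```python
from bisect import bisect_right

# Hypothetical revenue class boundaries in USD (assumed, not from the paper).
BOUNDARIES = [1_000_000, 10_000_000, 100_000_000]


def revenue_class(gross):
    """Map a gross revenue figure to a class index in 0..len(BOUNDARIES).

    A value equal to a boundary falls into the higher class.
    """
    return bisect_right(BOUNDARIES, gross)


def accuracy(predicted, actual):
    """Return the fraction of predicted class labels that match the actual ones."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty set of labels")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


if __name__ == "__main__":
    # Made-up revenue figures, used only to exercise the functions.
    grosses = [500_000, 4_000_000, 50_000_000, 200_000_000, 800_000]
    actual = [revenue_class(g) for g in grosses]
    predicted = [0, 1, 2, 2, 1]  # hypothetical model output
    print(actual)
    print(accuracy(predicted, actual))
```

Under this reading, an accuracy of 60% would mean that three out of every five movies are placed in the correct revenue class.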

Abstract

Movie-making has become one of the most costly and risky endeavors in the entertainment industry. Continuous change in audience preferences makes it harder to predict what kind of movie will be financially successful at the box office. So, it is no wonder that cautious, intelligent stakeholders and large production houses will always want to know the probable revenue that a movie will generate before making an investment. Researchers have been working on finding an optimal strategy to help investors make the right decisions, but the lack of a large, up-to-date dataset makes their work harder. In this work, we introduce an up-to-date, richer, and larger dataset that we have prepared by scraping IMDb for researchers and data analysts to work with. The compiled dataset contains the summary data of 7.5 million titles and detailed information on more than 200K movies.…

Code & Models

Repositories

arnab-api/movie-analysis
tfOfficial
