A multimodal approach for multi-label movie genre classification
Rafael B. Mangolin, Rodolfo M. Pereira, Alceu S. Britto Jr., Carlos N. Silla Jr., Valéria D. Feltrim, Diego Bertolini, Yandre M. G. Costa

TL;DR
This paper presents a comprehensive multimodal dataset and approach for multi-label movie genre classification, combining video, text, and audio features with various classifiers and fusion strategies, achieving promising results.
Contribution
It introduces a large, curated multimodal dataset and evaluates multiple feature extraction and classifier fusion methods for improved genre classification.
Findings
Fusion of LSTM and CNN classifiers yields best results.
Multimodal features outperform single-source features.
Complementarity among different data sources enhances classification performance.
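The fusion idea behind these findings can be illustrated with a minimal late-fusion sketch. This is not the authors' implementation: the averaging rule, the 0.5 threshold, the three-genre label subset, and the probability values are all assumptions chosen for illustration. It only shows how per-label scores from two classifiers (here hypothetically an LSTM-based and a CNN-based one) could be combined in a multi-label setting.

```python
from typing import List

# Hypothetical subset of genre labels, for illustration only.
GENRES = ["Action", "Comedy", "Drama"]


def fuse_mean(prob_sets: List[List[float]]) -> List[float]:
    """Average the per-label probabilities produced by several classifiers."""
    n = len(prob_sets)
    return [sum(column) / n for column in zip(*prob_sets)]


def predict_labels(probs: List[float], labels: List[str],
                   threshold: float = 0.5) -> List[str]:
    """Multi-label decision: keep every label whose fused score reaches the threshold."""
    return [label for label, p in zip(labels, probs) if p >= threshold]


# Made-up per-genre probabilities from two classifiers for a single movie.
lstm_probs = [0.9, 0.2, 0.4]  # e.g. a text-based model (assumed)
cnn_probs = [0.7, 0.1, 0.8]   # e.g. an image-based model (assumed)

fused = fuse_mean([lstm_probs, cnn_probs])
predicted = predict_labels(fused, GENRES)
print(predicted)
```

In this toy case "Drama" is rejected by one classifier alone but accepted after fusion, which is the kind of complementarity between sources the findings refer to.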
Abstract
Movie genre classification is a challenging task that has increasingly attracted the attention of researchers. In this paper, we addressed the multi-label classification of movie genres in a multimodal way. For this purpose, we created a dataset composed of trailer video clips, subtitles, synopses, and movie posters taken from 152,622 movie titles from The Movie Database. The dataset was carefully curated and organized, and it was also made available as a contribution of this work. Each movie in the dataset was labeled according to a set of eighteen genre labels. We extracted features from these data using different kinds of descriptors, namely Mel Frequency Cepstral Coefficients, Statistical Spectrum Descriptor, Local Binary Pattern with spectrograms, Long Short-Term Memory, and Convolutional Neural Networks. The descriptors were evaluated using different classifiers, such as…
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
