MovieSum: An Abstractive Summarization Dataset for Movie Screenplays
Rohit Saxena, Frank Keller

TL;DR
MovieSum introduces a new, large dataset of movie screenplays and summaries to advance research in abstractive screenplay summarization, addressing challenges of long input processing and structural understanding.
Contribution
The paper presents MovieSum, a new dataset of 2200 movie screenplays paired with Wikipedia plot summaries and metadata. The screenplays are manually formatted to represent their structural elements, supporting research on long-document summarization.
Findings
Large language models are evaluated on MovieSum to provide summarization baselines.
MovieSum is larger and more detailed than previous datasets.
Metadata facilitates access to external knowledge.
Abstract
Movie screenplay summarization is challenging, as it requires an understanding of long input contexts and various elements unique to movies. Large language models have shown significant advancements in document summarization, but they often struggle with processing long input contexts. Furthermore, while television transcripts have received attention in recent studies, movie screenplay summarization remains underexplored. To stimulate research in this area, we present a new dataset, MovieSum, for abstractive summarization of movie screenplays. This dataset comprises 2200 movie screenplays accompanied by their Wikipedia plot summaries. We manually formatted the movie screenplays to represent their structural elements. Compared to existing datasets, MovieSum possesses several distinctive features: (1) It includes movie screenplays, which are longer than scripts of TV episodes. (2) It is…
Taxonomy
Topics: Computational and Text Analysis Methods · Video Analysis and Summarization · Generative Adversarial Networks and Image Synthesis
Methods: Softmax · Attention Is All You Need
