Large language models for folktale type automation based on motifs: Cinderella case study
Tjaša Arčon, Marko Robnik-Šikonja, Polona Tratnik

TL;DR
This paper presents a methodology using large language models and NLP to analyze motifs in Cinderella folktale variants, enabling large-scale, cross-lingual folkloristic analysis.
Contribution
It introduces a novel approach combining machine learning and NLP for motif detection and analysis in folklore studies, demonstrating large language models' effectiveness.
Findings
Large language models detect complex interactions in folktales
Method enables cross-lingual comparison of folktale motifs
Facilitates large-scale computational analysis of folklore texts
Abstract
Artificial intelligence approaches are being adapted to many research areas, including digital humanities. We built a methodology for large-scale analyses in folkloristics. Using machine learning and natural language processing, we automatically detected motifs in a large collection of Cinderella variants and analysed their similarities and differences with clustering and dimensionality reduction. The results show that large language models detect complex interactions in tales, enabling computational analysis of extensive text collections and facilitating cross-lingual comparisons.
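The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes each tale variant has already been reduced to a set of detected motifs. The motif labels, variant names, and the 0.5 threshold below are hypothetical and are not taken from the paper. Jaccard similarity with single-linkage grouping stands in for the clustering and dimensionality reduction the authors mention; the abstract does not say which algorithms they used.

```python
from itertools import combinations

# Hypothetical motif annotations: tale variant -> set of detected motifs.
# Illustrative labels only; they are not taken from the paper's data.
variants = {
    "variant_a": {"persecuted_heroine", "magical_helper", "lost_slipper"},
    "variant_b": {"persecuted_heroine", "magical_helper", "lost_slipper", "ball"},
    "variant_c": {"persecuted_heroine", "helpful_animal", "recognition_by_ring"},
    "variant_d": {"helpful_animal", "recognition_by_ring", "flight_from_father"},
}


def jaccard(a, b):
    """Jaccard similarity between two motif sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(variants, threshold=0.5):
    """Group variants whose motif sets are linked by similarity >= threshold.

    This is single-linkage grouping via union-find, an assumed stand-in
    for whatever clustering method the paper actually applies.
    """
    parent = {name: name for name in variants}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(variants, 2):
        if jaccard(variants[a], variants[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in variants:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


clusters = cluster(variants)
print(clusters)
```

Because the comparison operates on motif labels rather than on the raw text, the same grouping could in principle be applied to variants in different languages once their motifs have been detected.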
Taxonomy
Topics: Folklore, Mythology, and Literature Studies · Language and cultural evolution · Artificial Intelligence in Games
