Analyzing Folktales of Different Regions Using Topic Modeling and Clustering
Jacob Werzinsky, Zhiyan Zhong, Xuedan Zou

TL;DR
This study uses topic modeling and clustering to analyze folktales from various regions, revealing cultural similarities and differences, and demonstrating the effectiveness of NLP techniques in cultural document analysis.
Contribution
It introduces a novel application of topic modeling and clustering to compare folktales across regions, uncovering cultural patterns and regional differences.
Findings
Common themes include family, food, gender roles, mythological figures, and animals.
Regional differences in folktale topics reflect the local environment and the animals found there.
European and Asian folktales often share similarities.
Abstract
This paper employs two major natural language processing techniques, topic modeling and clustering, to find patterns in folktales and reveal cultural relationships between regions. In particular, we used Latent Dirichlet Allocation and BERTopic to extract recurring elements, and K-means clustering to group folktales. Our paper tries to answer two questions: what are the similarities and differences between folktales, and what do they say about culture? We show that the common themes across folktales are family, food, traditional gender roles, mythological figures, and animals. Folktale topics also differ by geographical location, with folktales from different regions featuring different animals and environments. We were not surprised to find that religious figures and animals are among the common topics in all cultures. However, we were surprised that European and…
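To make the clustering step concrete, here is a minimal sketch of grouping short texts by TF-IDF similarity with a spherical (cosine-based) variant of k-means. It is not the authors' pipeline: the four toy sentences, the whitespace tokenizer, the fixed seed documents, and the function names `tfidf_vectors` and `kmeans` are all assumptions made for illustration, and a real study would more likely rely on an established library.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors (term -> weight dicts)."""
    tokenised = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n = len(tokenised)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Smoothed IDF, so terms present in every document still get weight 1.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def mean_vector(vectors):
    """Sum the member vectors and rescale the result to unit length."""
    total = Counter()
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {t: w / norm for t, w in total.items()}

def kmeans(vectors, k, seeds, iterations=20):
    """Spherical k-means: assign each vector to its most similar centroid."""
    centroids = [vectors[i] for i in seeds]  # fixed seeds keep the toy run deterministic
    labels = [None] * len(vectors)
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda c: cosine(vec, centroids[c]))
            for vec in vectors
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for c in range(k):
            members = [vec for vec, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = mean_vector(members)
    return labels

# Hypothetical toy corpus; not taken from the paper's dataset.
docs = [
    "the fox tricked the crow in the forest",
    "the crow and the fox met in the forest",
    "the king gave the princess a golden ring",
    "the princess lost the golden ring of the king",
]
labels = kmeans(tfidf_vectors(docs), k=2, seeds=(0, 2))
print(labels)  # [0, 0, 1, 1]
```

The two animal tales land in one cluster and the two royal tales in the other because each pair shares distinctive vocabulary. With real folktales, stop-word removal and a more careful choice of the number of clusters would matter far more than they do in this toy run.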
Taxonomy
Topics: Computational and Text Analysis Methods · Folklore, Mythology, and Literature Studies
Methods: k-Means Clustering
