KAHANI: Culturally-Nuanced Visual Storytelling Tool for Non-Western Cultures
Hamna, Deepthi Sudharsan, Agrima Seth, Ritvik Budhiraja, Deepika Khullar, Vyshak Jain, Kalika Bali, Aditya Vashistha, and Sameer Segal

TL;DR
Kahani is a visual storytelling tool for non-Western cultures that uses GPT-4 Turbo and SDXL to generate culturally grounded stories; in a comparative user study, it produced more culturally nuanced stories than ChatGPT-4 with DALL-E 3.
Contribution
This paper introduces Kahani, a tool that generates culturally specific visual stories for non-Western cultures using Chain of Thought (CoT) and Text-To-Image (T2I) prompting techniques.
Findings
Kahani produces more culturally nuanced stories than ChatGPT-4 with DALL-E 3.
In 75% of comparisons, Kahani outperformed or matched ChatGPT-4.
User study confirms Kahani's effectiveness in capturing cultural nuances.
Abstract
Large Language Models (LLMs) and Text-To-Image (T2I) models have demonstrated the ability to generate compelling text and visual stories. However, their outputs are predominantly aligned with the sensibilities of the Global North, often resulting in an outsider's gaze on other cultures. As a result, non-Western communities have to put extra effort into generating culturally specific stories. To address this challenge, we developed a visual storytelling tool called Kahani that generates culturally grounded visual stories for non-Western cultures. Our tool leverages the off-the-shelf models GPT-4 Turbo and Stable Diffusion XL (SDXL). By using Chain of Thought (CoT) and T2I prompting techniques, we capture the cultural context from the user's prompt and generate vivid descriptions of the characters and scene compositions. To evaluate the effectiveness of Kahani, we conducted a comparative user…
Taxonomy
Topics: Digital Storytelling and Education
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Byte Pair Encoding · Layer Normalization · Residual Connection · Multi-Head Attention · Softmax · Adam
