LLM-Human Pipeline for Cultural Context Grounding of Conversations
Rajkumar Pujari, Dan Goldwasser

TL;DR
This paper introduces a framework that combines large language models with human input to build a culturally grounded conversational dataset, with the aim of improving NLP models' understanding of culture-specific social norms.
Contribution
It presents a new Cultural Context Schema for conversations and a dataset of ~110k LLM-generated social norm and violation descriptions covering ~23k conversations from Chinese culture.
Findings
Generated ~110k social norm and violation descriptions for ~23k conversations from Chinese culture using LLMs.
Refined the descriptions using automated verification strategies, which were evaluated against culturally aware human judgments.
Improved performance in emotion, sentiment, and dialogue act detection tasks.
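The refinement finding above can be made concrete with a small sketch. It assumes each generated description receives a boolean verdict from an automated verifier and, for an evaluation subset, a boolean human judgment. The function names and the choice of simple agreement as the metric are illustrative assumptions, not the authors' method.

```python
from typing import List


def agreement_rate(auto_verdicts: List[bool], human_verdicts: List[bool]) -> float:
    """Fraction of items where the automated verdict matches the human one.

    Hypothetical helper: the paper's actual evaluation metric is not
    stated in this summary.
    """
    if len(auto_verdicts) != len(human_verdicts):
        raise ValueError("verdict lists must have the same length")
    if not auto_verdicts:
        raise ValueError("verdict lists must not be empty")
    matches = sum(a == h for a, h in zip(auto_verdicts, human_verdicts))
    return matches / len(auto_verdicts)


def keep_verified(descriptions: List[str], auto_verdicts: List[bool]) -> List[str]:
    """Keep only the descriptions the automated verifier accepted."""
    return [d for d, ok in zip(descriptions, auto_verdicts) if ok]


# Toy data, invented for illustration only.
auto = [True, False, True, True]
human = [True, False, False, True]
rate = agreement_rate(auto, human)
kept = keep_verified(["norm A", "norm B", "norm C", "norm D"], auto)
```

A high agreement rate on the human-judged subset would be one reason to trust the automated verifier on the rest of the generated descriptions.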
Abstract
Conversations often adhere to well-understood social norms that vary across cultures. For example, while "addressing parents by name" is commonplace in the West, it is rare in most Asian cultures. Adherence or violation of such norms often dictates the tenor of conversations. Humans are able to navigate social situations requiring cultural awareness quite adeptly. However, it is a hard task for NLP models. In this paper, we tackle this problem by introducing a "Cultural Context Schema" for conversations. It comprises (1) conversational information such as emotions, dialogue acts, etc., and (2) cultural information such as social norms, violations, etc. We generate ~110k social norm and violation descriptions for ~23k conversations from Chinese culture using LLMs. We refine them using automated verification strategies which are evaluated against culturally aware human judgements. We…
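As a rough illustration of the two-part schema described in the abstract, the sketch below models conversational information (emotions, dialogue acts) and cultural information (social norms, violations) as Python dataclasses. All class names, field names, and the sample values are assumptions made for this example; the paper's actual schema fields may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    """One utterance with its conversational information."""
    speaker: str
    text: str
    emotion: str = "neutral"      # hypothetical label set
    dialogue_act: str = "inform"  # hypothetical label set


@dataclass
class NormDescription:
    """One social norm relevant to the conversation (cultural information)."""
    text: str
    violated: bool = False  # True when the conversation breaks the norm


@dataclass
class CulturalContext:
    """Hypothetical container pairing both parts of the schema."""
    culture: str
    turns: List[Turn] = field(default_factory=list)
    norms: List[NormDescription] = field(default_factory=list)

    def violations(self) -> List[NormDescription]:
        """Return only the norms marked as violated."""
        return [n for n in self.norms if n.violated]


# Invented example based on the abstract's "addressing parents by name" case;
# it is not taken from the dataset.
example = CulturalContext(
    culture="Chinese",
    turns=[Turn("child", "Good morning, Wei.", dialogue_act="greeting")],
    norms=[NormDescription("Addressing parents by name is rare.", violated=True)],
)
```

Keeping norms and violations in the same record as the turns lets a downstream model read the cultural context alongside the emotion and dialogue act labels.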
Taxonomy
Topics: Speech and dialogue systems
Methods: Attentive Walk-Aggregating Graph Neural Network
