StereoKG: Data-Driven Knowledge Graph Construction for Cultural Knowledge and Stereotypes
Awantee Deshpande, Dana Ruiter, Marius Mosbach, Dietrich Klakow

TL;DR
This paper introduces StereoKG, a fully data-driven pipeline that automatically constructs a knowledge graph of cultural knowledge and stereotypes, supporting bias analysis of NLP models without relying on human-compiled lists of bias terms.
Contribution
The study presents a fully automated pipeline for building a knowledge graph of cultural knowledge and stereotypes. The resulting graph covers 5 religious groups and 5 nationalities and can be extended to more entities.
Findings
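In human evaluation, a majority (59.2%) of non-singleton entries are coherent and complete stereotypes
Intermediate masked language model training on the verbalized KG leads to a higher level of cultural awareness in the model
KG-based training has the potential to improve hate speech detection performance on knowledge-crucial samples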
Abstract
Analyzing ethnic or religious bias is important for improving fairness, accountability, and transparency of natural language processing models. However, many techniques rely on human-compiled lists of bias terms, which are expensive to create and are limited in coverage. In this study, we present a fully data-driven pipeline for generating a knowledge graph (KG) of cultural knowledge and stereotypes. Our resulting KG covers 5 religious groups and 5 nationalities and can easily be extended to include more entities. Our human evaluation shows that the majority (59.2%) of non-singleton entries are coherent and complete stereotypes. We further show that performing intermediate masked language model training on the verbalized KG leads to a higher level of cultural awareness in the model and has the potential to increase classification performance on knowledge-crucial samples on a related…
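The abstract mentions "verbalizing" the KG for intermediate masked language model training but does not say how. The sketch below is only one plausible reading, not the authors' method: it assumes entries are (subject, predicate, object) triples, joins them with a naive template, and hides the whole object behind a mask token. The names `Triple`, `verbalize`, and `mask_object`, the template, the `[MASK]` token, and the placeholder entries are all hypothetical.

```python
from typing import List, Tuple

# Assumed entry format: (subject, predicate, object). Not taken from the paper.
Triple = Tuple[str, str, str]


def verbalize(triple: Triple) -> str:
    """Join a triple into a plain sentence using a naive template."""
    subject, predicate, obj = triple
    return f"{subject} {predicate} {obj}."


def mask_object(triple: Triple, mask_token: str = "[MASK]") -> Tuple[str, str]:
    """Return (masked sentence, target), hiding the object of the triple.

    Standard masked language modeling masks random tokens; masking the
    whole object is a simplification chosen here for illustration.
    """
    subject, predicate, obj = triple
    return f"{subject} {predicate} {mask_token}.", obj


# Neutral placeholder entries, not drawn from the actual KG.
triples: List[Triple] = [
    ("group_a", "celebrate", "festival_x"),
    ("group_b", "speak", "language_y"),
]

sentences = [verbalize(t) for t in triples]
masked = [mask_object(t) for t in triples]
```

Sentences produced this way could serve as an intermediate training corpus, with the masked variants as fill-in probes. The tokenizer, model, and training schedule are left open because the abstract does not specify them.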
Taxonomy
Topics: Religious Education and Schools · Terrorism, Counterterrorism, and Political Violence
