How Contentious Terms About People and Cultures are Used in Linked Open Data
Andrei Nesterov (1), Laura Hollink (1), Jacco van Ossenbruggen (2), ((1) Centrum Wiskunde & Informatica, (2) VU University Amsterdam)

TL;DR
This study analyzes the occurrence and marking of contentious terms about people and cultures in linked open data, revealing frequent use of outdated stereotypes and inconsistent marking practices across datasets.
Contribution
It provides a systematic analysis of contentious term usage in LOD and highlights the rarity and inconsistency of explicit markers for such terms.
Findings
Contentious terms frequently appear in descriptive literals.
Marking of contentious terms is rare and inconsistent.
Outdated stereotypes are often propagated through LOD literals.
Abstract
Web resources in linked open data (LOD) are comprehensible to humans through literal textual values attached to them, such as labels, notes, or comments. Word choices in literals may not always be neutral. When outdated and culturally stereotyping terminology is used in literals, they may appear as offensive to users in interfaces and propagate stereotypes to algorithms trained on them. We study how frequently and in which literals contentious terms about people and cultures occur in LOD and whether there are attempts to mark the usage of such terms. For our analysis, we reuse English and Dutch terms from a knowledge graph that provides opinions of experts from the cultural heritage domain about terms' contentiousness. We inspect occurrences of these terms in four widely used datasets: Wikidata, The Getty Art & Architecture Thesaurus, Princeton WordNet, and Open Dutch WordNet. Some…
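The occurrence analysis described in the abstract can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the term lists are hypothetical placeholders rather than entries from the knowledge graph the authors reuse, the literals are invented, and case-insensitive whole-word matching is only one plausible strategy, not necessarily the one used in the paper.

```python
import re
from collections import Counter

# Hypothetical placeholder terms per language (NOT taken from the paper's knowledge graph).
TERMS = {
    "en": ["primitive", "exotic"],
    "nl": ["primitief", "exotisch"],
}

# Invented toy literals: (resource, property, text, language tag).
LITERALS = [
    ("ex:1", "rdfs:label", "Primitive art", "en"),
    ("ex:1", "skos:scopeNote", "Use for art once described as primitive.", "en"),
    ("ex:2", "rdfs:comment", "A modern painting", "en"),
    ("ex:3", "skos:altLabel", "exotische kunst", "nl"),
    ("ex:4", "rdfs:label", "Exotisch fruit", "nl"),
]


def compile_patterns(terms):
    """Build one case-insensitive, whole-word regex per language."""
    return {
        lang: re.compile(
            r"\b(" + "|".join(re.escape(word) for word in words) + r")\b",
            re.IGNORECASE,
        )
        for lang, words in terms.items()
    }


def find_occurrences(literals, patterns):
    """Return (resource, property, matched term) for every term hit in a literal."""
    hits = []
    for resource, prop, text, lang in literals:
        pattern = patterns.get(lang)
        if pattern is None:
            continue  # no term list for this language
        for match in pattern.finditer(text):
            hits.append((resource, prop, match.group(0).lower()))
    return hits


hits = find_occurrences(LITERALS, compile_patterns(TERMS))

# Count hits per property to see in which kinds of literals the terms occur.
per_property = Counter(prop for _, prop, _ in hits)
for prop, count in per_property.most_common():
    print(prop, count)
```

Grouping hits by property mirrors the question of which literals (labels, notes, comments) contain the terms. Whole-word matching skips inflected forms such as "exotische" in the toy data, so a real analysis would likely need lemmatization or a list of surface forms.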
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Wikis in Education and Collaboration
Methods: Sparse Evolutionary Training
