A "Perspectival" Mirror of the Elephant: Investigating Language Bias on Google, ChatGPT, YouTube, and Wikipedia
Queenie Luo, Michael J. Puett, Michael D. Smith

TL;DR
This paper investigates language bias on Google, Wikipedia, YouTube, and ChatGPT: each platform reflects the culturally dominant views tied to the search language, which creates an invisible cultural barrier online and limits exposure to diverse perspectives.
Contribution
The paper provides empirical evidence of language bias on major online platforms and discusses its social implications, highlighting how narrow a set of cultural views the search results in any one language reflect.
Findings
Search results vary significantly across languages.
Major platforms reflect culturally dominant views.
Language bias creates an invisible cultural barrier.
Abstract
Contrary to Google Search's mission of delivering information from "many angles so you can form your own understanding of the world," we find that Google and its most prominent returned results - Wikipedia and YouTube - simply reflect a narrow set of culturally dominant views tied to the search language for complex topics like "Buddhism," "Liberalism," "colonization," "Iran" and "America." Simply stated, they present, to varying degrees, distinct information across the same search in different languages, a phenomenon we call language bias. This paper presents evidence and analysis of language bias and discusses its larger social implications. We find that our online searches and emerging tools like ChatGPT turn us into the proverbial blind person touching a small portion of an elephant, ignorant of the existence of other cultural perspectives. Language bias sets a strong yet invisible…
Taxonomy
Topics: Text Readability and Simplification · Linguistics, Language Diversity, and Identity · Hate Speech and Cyberbullying Detection
