Modeling the Sacred: Considerations when Using Religious Texts in Natural Language Processing
Ben Hutchinson

TL;DR
This paper discusses the ethical and cultural considerations of using religious texts in NLP, emphasizing the importance of understanding data provenance, cultural context, and researcher positionality to avoid biases and misrepresentation.
Contribution
The paper highlights the ethical implications of using religious texts in NLP research and calls for greater awareness of cultural and religious sensitivities.
Findings
Religious texts encode culturally important values.
Translations are often used to supplement scarce data.
Using religious texts raises ethical and cultural considerations.
Abstract
This position paper concerns the use of religious texts in Natural Language Processing (NLP), which is of special interest to the Ethics of NLP. Religious texts are expressions of culturally important values, and machine learned models have a propensity to reproduce cultural values encoded in their training data. Furthermore, translations of religious texts are frequently used by NLP researchers when language data is scarce. This repurposes the translations from their original uses and motivations, which often involve attracting new followers. This paper argues that NLP's use of such texts raises considerations that go beyond model biases, including data provenance, cultural contexts, and their use in proselytism. We argue for more consideration of researcher positionality, and of the perspectives of marginalized linguistic and religious communities.
Taxonomy
Topics: Media, Religion, Digital Communication
