Modeling the Sacred: Considerations when Using Religious Texts in Natural Language Processing

Ben Hutchinson

arXiv:2404.14740 · cs.CL · July 23, 2025

TL;DR

This paper discusses the ethical and cultural considerations of using religious texts in NLP, emphasizing the importance of understanding data provenance, cultural context, and researcher positionality to avoid biases and misrepresentation.

Contribution

The paper highlights the ethical implications of using religious texts in NLP research and calls for greater awareness of cultural and religious sensitivities.

Findings

1. Religious texts encode culturally important values.

2. Translations of religious texts are often used to supplement scarce language data.

3. Using religious texts in NLP raises ethical and cultural considerations.

Abstract

This position paper concerns the use of religious texts in Natural Language Processing (NLP), which is of special interest to the Ethics of NLP. Religious texts are expressions of culturally important values, and machine learned models have a propensity to reproduce cultural values encoded in their training data. Furthermore, translations of religious texts are frequently used by NLP researchers when language data is scarce. This repurposes the translations from their original uses and motivations, which often involve attracting new followers. This paper argues that NLP's use of such texts raises considerations that go beyond model biases, including data provenance, cultural contexts, and their use in proselytism. We argue for more consideration of researcher positionality, and of the perspectives of marginalized linguistic and religious communities.

Taxonomy

Topics: Media, Religion, Digital Communication