Intertextual Parallel Detection in Biblical Hebrew: A Transformer-Based Benchmark
David M. Smiley

TL;DR
This paper evaluates pre-trained transformer-based language models for detecting intertextual parallels in biblical Hebrew and finds that E5 and AlephBERT show promise for making this work more accurate and efficient.
Contribution
It introduces a benchmark for intertextual parallel detection in biblical Hebrew using transformer models, highlighting their potential in ancient text analysis.
Findings
E5 performs best at detecting parallel passages
AlephBERT better distinguishes non-parallel passages
Pre-trained models enhance biblical intertextual analysis
Abstract
Identifying parallel passages in biblical Hebrew (BH) is central to biblical scholarship for understanding intertextual relationships. Traditional methods rely on manual comparison, a labor-intensive process prone to human error. This study evaluates the potential of pre-trained transformer-based language models, including E5, AlephBERT, MPNet, and LaBSE, for detecting textual parallels in the Hebrew Bible. Focusing on known parallels between Samuel/Kings and Chronicles, I assessed each model's capability to generate word embeddings distinguishing parallel from non-parallel passages. Using cosine similarity and Wasserstein distance measures, I found that E5 and AlephBERT show promise; E5 excels in parallel detection, while AlephBERT demonstrates stronger non-parallel differentiation. These findings indicate that pre-trained models can enhance the efficiency and accuracy of detecting…
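The two measures named in the abstract can be illustrated with a minimal sketch. It is not the paper's pipeline: the toy 3-dimensional vectors stand in for model embeddings, the helper names `cosine_similarity` and `wasserstein_1d` are hypothetical, and applying the Wasserstein distance to the two sets of similarity scores is an assumption about one plausible use of the measure, since the abstract does not say how it was applied.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    For equal-size samples with uniform weights this reduces to the
    mean absolute difference between the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-size samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Hypothetical toy "embeddings"; real ones would come from a model such as E5.
parallel_pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),
    ([0.0, 1.0, 0.0], [0.0, 2.0, 0.0]),
]
non_parallel_pairs = [
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
    ([0.0, 0.0, 1.0], [1.0, 0.0, 0.0]),
]

parallel_scores = [cosine_similarity(u, v) for u, v in parallel_pairs]
non_parallel_scores = [cosine_similarity(u, v) for u, v in non_parallel_pairs]

# Assumed usage: a larger distance between the two score distributions
# means the embeddings separate parallels from non-parallels more cleanly.
separation = wasserstein_1d(parallel_scores, non_parallel_scores)
print(parallel_scores, non_parallel_scores, separation)
```

Under this reading, a model whose parallel pairs score high and whose non-parallel pairs score low yields a large separation value, which gives one way to compare models on the same set of passages.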
Taxonomy
Topics: Biblical Studies and Interpretation · Freedom of Expression and Defamation · Topic Modeling
Methods: MPNet
