Interpreting Multilingual and Document-Length Sensitive Relevance Computations in Neural Retrieval Models through Axiomatic Causal Interventions

Oliver Savolainen; Dur e Najaf Amjad; Roxana Petcu

arXiv:2505.02154·cs.IR·May 6, 2025

TL;DR

This study reproduces and extends previous work on neural retrieval models, demonstrating that they encode term frequency and document length information across languages, with implications for interpretability and reproducibility in IR models.

Contribution

The study confirms that neural retrieval models encode term frequency across languages and extends the analysis to document length, using activation patching to localize where this information sits in the model.

Findings

01. Term frequency encoding generalizes across languages.
02. Activation patching isolates model components responsible for relevance.
03. Later layers encode sequence-level information in CLS tokens.

Abstract

This reproducibility study analyzes and extends the paper "Axiomatic Causal Interventions for Reverse Engineering Relevance Computation in Neural Retrieval Models," which investigates how neural retrieval models encode task-relevant properties such as term frequency. We reproduce key experiments from the original paper, confirming that information on query terms is captured in the model encoding. We extend this work by applying activation patching to Spanish and Chinese datasets and by exploring whether document-length information is encoded in the model as well. Our results confirm that the designed activation patching method can isolate the behavior to specific components and tokens in neural retrieval models. Moreover, our findings indicate that the location of term frequency generalizes across languages and that in later layers, the information for sequence-level tasks is…
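
As a rough illustration of the activation patching idea mentioned in the abstract, the sketch below uses a hypothetical two-component toy scorer. It is not the retrieval model or the code from the paper or its repository. It runs a baseline input and a perturbed input (here, a document with one extra query term), caches the perturbed run's intermediate values, and patches them into the baseline run one component at a time to see how much of the score change each component accounts for. The patching direction shown is one common setup and is an assumption, as are all names (`toy_model`, `patch_effects`, the `tf` and `len` components) and the weights.

```python
from typing import Dict, Optional, Tuple

Activations = Dict[str, float]


def toy_model(x: Dict[str, float],
              patch: Optional[Activations] = None) -> Tuple[float, Activations]:
    """Hypothetical scorer with two named intermediate components.

    `patch` overrides a component's activation with a cached value,
    which is the core move in activation patching.
    """
    patch = patch or {}
    acts: Activations = {}
    # Component "tf": responds to how often the query term occurs (assumed weight).
    acts["tf"] = patch.get("tf", 2.0 * x["term_count"])
    # Component "len": responds to document length (assumed weight).
    acts["len"] = patch.get("len", -0.125 * x["doc_len"])
    score = acts["tf"] + acts["len"]
    return score, acts


def patch_effects(baseline: Dict[str, float],
                  perturbed: Dict[str, float]) -> Dict[str, float]:
    """Fraction of the baseline-to-perturbed score change recovered per component."""
    base_score, _ = toy_model(baseline)
    pert_score, pert_acts = toy_model(perturbed)
    total = pert_score - base_score
    if total == 0:
        raise ValueError("perturbation did not change the score")
    effects: Dict[str, float] = {}
    for name, cached in pert_acts.items():
        # Rerun the baseline input, swapping in one cached perturbed activation.
        patched_score, _ = toy_model(baseline, patch={name: cached})
        effects[name] = (patched_score - base_score) / total
    return effects


baseline = {"term_count": 1, "doc_len": 50}
perturbed = {"term_count": 2, "doc_len": 51}  # one extra query term appended
effects = patch_effects(baseline, perturbed)
for name, value in effects.items():
    print(f"{name}: {value:+.3f}")
```

In this toy setting the `tf` component accounts for most of the score change, which mirrors the kind of localization claim the abstract describes. In a real transformer the components would be attention heads or residual-stream positions rather than two hand-written terms.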

Code & Models

Repositories

oliversavolainen/axiomatic-ir-reproduce
jax · Official

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Neural Networks and Applications

Methods: Activation Patching