# Causality and signalling of garden-path sentences

**Authors:** Daphne Wang, Mehrnoosh Sadrzadeh

PMC · DOI: 10.1098/rsta.2023.0013 · Philosophical transactions. Series A, Mathematical, physical, and engineering sciences · 2024-01-29

## TL;DR

The paper models garden-path sentences with presheaves whose probabilities come from BERT, and shows that the resulting degree of signalling predicts human parsing difficulty better than surprisal does.

## Contribution

The paper introduces a method for modelling syntactic ambiguities with presheaves and BERT probabilities; its degree of signalling outperforms surprisal in predicting garden-path effects.

## Key findings

- The degree of signalling distinguishes hard from easy garden-path sentences (p-value < 10⁻⁵), which existing work could not do.
- On one of the two psycholinguistic datasets, the degree of signalling yields a larger garden-path effect than surprisal (32 ms versus 8.75 ms per word), leading to better prediction accuracy.
- The degree of signalling is computed from probabilities supplied by the large language model BERT.
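To give a rough feel for what "signalling" means in the findings above, the sketch below checks whether two overlapping contexts agree on the marginal distribution of the observable they share. It is a toy illustration only, not the paper's degree-of-signalling measure: the function names (`marginal`, `max_marginal_gap`), the observable labels and all probabilities are hypothetical and invented for this example.

```python
from itertools import combinations

def marginal(context, dist, keep):
    """Marginalise a joint distribution over `context` onto the observables in `keep`."""
    idx = [context.index(m) for m in keep]
    out = {}
    for outcome, p in dist.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def max_marginal_gap(model):
    """Largest disagreement between marginals on shared observables.

    A gap of 0 means the model is no-signalling; a positive gap means the
    choice of context changes the statistics of a shared observable.
    This is a crude indicator, NOT the measure used in the paper.
    """
    gap = 0.0
    for (c1, d1), (c2, d2) in combinations(model.items(), 2):
        shared = [m for m in c1 if m in c2]
        if not shared:
            continue
        m1 = marginal(c1, d1, shared)
        m2 = marginal(c2, d2, shared)
        for key in set(m1) | set(m2):
            gap = max(gap, abs(m1.get(key, 0.0) - m2.get(key, 0.0)))
    return gap

# Hypothetical toy models: observable "a" appears in both contexts.
signalling_model = {
    ("a", "b"): {(0, 0): 0.5, (1, 1): 0.5},
    ("a", "c"): {(0, 0): 0.75, (1, 1): 0.25},  # marginal of "a" differs
}
consistent_model = {
    ("a", "b"): {(0, 0): 0.5, (1, 1): 0.5},
    ("a", "c"): {(0, 1): 0.5, (1, 0): 0.5},    # marginal of "a" agrees
}

signalling_gap = max_marginal_gap(signalling_model)
consistent_gap = max_marginal_gap(consistent_model)
```

In the first toy model the probability that "a" takes value 0 depends on which other observable accompanies it, so the gap is positive; in the second the marginals coincide and the gap is zero.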

## Abstract

Sheaves are mathematical objects that describe the globally compatible data associated with open sets of a topological space. Original examples of sheaves were continuous functions; later they also became powerful tools in algebraic geometry, as well as logic and set theory. More recently, sheaves have been applied to the theory of contextuality in quantum mechanics. Whenever the local data are not necessarily compatible, sheaves are replaced by the simpler setting of presheaves. In previous work, we used presheaves to model lexically ambiguous phrases in natural language and identified the order of their disambiguation. In the work presented here, we model syntactic ambiguities and study a phenomenon in human parsing called garden-pathing. It has been shown that the information-theoretic quantity known as ‘surprisal’ correlates with human reading times in natural language but fails to do so in garden-path sentences. We compute the degree of signalling in our presheaves using probabilities from the large language model BERT and evaluate predictions on two psycholinguistic datasets. Our degree of signalling outperforms surprisal in two ways: (i) it distinguishes between hard and easy garden-path sentences (with a p-value < 10⁻⁵), whereas existing work could not, (ii) its garden-path effect is larger in one of the datasets (32 ms versus 8.75 ms per word), leading to better prediction accuracies.
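The surprisal baseline mentioned in the abstract is the standard information-theoretic quantity −log₂ p(word | context). The following minimal sketch shows the arithmetic only. The word probabilities are invented for illustration and do not come from BERT or from the paper's datasets, and the sentence is a classic textbook garden-path example rather than one confirmed to be in those datasets.

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal in bits of an event with probability `prob`: -log2(prob)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)

# Hypothetical conditional probabilities p(word | preceding words),
# made up for this example (a real study would query a language model).
word_probs = [
    ("The", 0.25),
    ("horse", 0.125),
    ("raced", 0.5),
    ("fell", 0.0078125),  # the disambiguating word is very unexpected
]

surprisals = {word: surprisal(p) for word, p in word_probs}
```

Under these made-up numbers the disambiguating word "fell" carries 7 bits of surprisal against 1 to 3 bits for the earlier words, which is the kind of spike that surprisal-based accounts associate with longer reading times.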

This article is part of the theme issue ‘Quantum contextuality, causality and freedom of choice’.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC10822712/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10822712/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/PMC10822712/full.md

---
Source: https://tomesphere.com/paper/PMC10822712