# Transfer Learning for Causal Sentence Detection

**Authors:** Manolis Kyriakakis, Ion Androutsopoulos, Joan Ginés i Ametllé, Artur Saudabayev

arXiv: 1906.07544 · 2019-06-21

## TL;DR

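This paper tests whether transfer learning (ELMO and BERT) improves causal sentence detection over a bidirectional GRU with self-attention (BIGRUATT) baseline. Transfer learning helps only on very small datasets; with larger datasets, the baseline reaches a plateau and neither more data nor transfer learning helps.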

## Contribution

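It applies transfer learning (ELMO and BERT) to causal sentence detection, compares it against a BIGRUATT baseline on generic public relation extraction datasets and a new biomedical causal sentence detection dataset, and makes a subset of the new dataset publicly available.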

## Key findings

- Transfer learning helps only on very small datasets.
- With larger datasets, the BIGRUATT baseline reaches a performance plateau.
- Beyond that plateau, neither additional data nor transfer learning improves results.

## Abstract

We consider the task of detecting sentences that express causality, as a step towards mining causal relations from texts. To bypass the scarcity of causal instances in relation extraction datasets, we exploit transfer learning, namely ELMO and BERT, using a bidirectional GRU with self-attention (BIGRUATT) as a baseline. We experiment with both generic public relation extraction datasets and a new biomedical causal sentence detection dataset, a subset of which we make publicly available. We find that transfer learning helps only in very small datasets. With larger datasets, BIGRUATT reaches a performance plateau, then larger datasets and transfer learning do not help.
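To make the baseline concrete, here is a minimal sketch of the self-attention pooling step that a model like BIGRUATT applies on top of its bidirectional GRU states. This summary does not give the paper's exact attention formulation, so the single scoring vector `v`, the output weights `w_out` and `b_out`, and all function names are illustrative assumptions rather than the authors' implementation. The hidden states are toy values standing in for real GRU outputs.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, v):
    # hidden_states: one vector per token (e.g. concatenated forward and
    # backward GRU states). v: hypothetical learned scoring vector.
    scores = [sum(h_j * v_j for h_j, v_j in zip(h, v)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(v)
    # Weighted sum of token states gives a fixed-size sentence vector.
    pooled = [sum(w * h[j] for w, h in zip(weights, hidden_states))
              for j in range(dim)]
    return pooled, weights

def causal_probability(pooled, w_out, b_out):
    # Logistic output layer: probability that the sentence is causal.
    z = sum(p * w for p, w in zip(pooled, w_out)) + b_out
    return 1.0 / (1.0 + math.exp(-z))

# Toy usage: three tokens, four-dimensional states (assumed values).
states = [[0.1, 0.3, -0.2, 0.5],
          [0.9, -0.4, 0.7, 0.0],
          [0.2, 0.2, 0.1, -0.3]]
v = [0.5, -0.1, 0.8, 0.2]
sentence_vec, attn = attention_pool(states, v)
prob = causal_probability(sentence_vec, [0.3, 0.3, 0.3, 0.3], 0.0)
```

In a transfer learning setup of the kind the abstract describes, the token vectors would come from a pretrained encoder such as ELMO or BERT rather than from embeddings trained from scratch; the pooling and classification steps could stay the same.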

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.07544/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1906.07544/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1906.07544/full.md

---
Source: https://tomesphere.com/paper/1906.07544