# Automatically Learning Construction Injury Precursors from Text

**Authors:** Henrietta Baker, Matthew R. Hallowell, Antoine J.-P. Tixier

arXiv: 1907.11769 · 2026-05-21

## TL;DR

This paper compares two deep learning NLP architectures, CNN and HAN, with the established TF-IDF + SVM approach for automatically learning injury precursors from raw construction accident reports.

## Contribution

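For each of the three models (CNN, HAN, and TF-IDF + SVM), the paper provides a post-training method to identify the textual patterns that are, on average, most predictive of each safety outcome. The same methods let a user visualize and understand the models' predictions.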

## Key findings

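- Valid injury precursors can be found among the text patterns that the trained models rank as most predictive of each safety outcome.
- The interpretation methods also let users visualize and understand individual model predictions on report text.
- The comparison covers both deep learning architectures (CNN, HAN) and the established TF-IDF + SVM approach, each learning from raw accident reports.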

## Abstract

In light of the increasing availability of digitally recorded safety reports in the construction industry, it is important to develop methods to exploit these data to improve our understanding of safety incidents and ability to learn from them. In this study, we compare several approaches to automatically learn injury precursors from raw construction accident reports. More precisely, we experiment with two state-of-the-art deep learning architectures for Natural Language Processing (NLP), Convolutional Neural Networks (CNN) and Hierarchical Attention Networks (HAN), and with the established Term Frequency - Inverse Document Frequency representation (TF-IDF) + Support Vector Machine (SVM) approach. For each model, we provide a method to identify (after training) the textual patterns that are, on average, the most predictive of each safety outcome. We show that among those pieces of text, valid injury precursors can be found. The proposed methods can also be used by the user to visualize and understand the models' predictions.
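To make the TF-IDF + SVM idea from the abstract concrete, here is a minimal sketch in pure Python. It is not the authors' implementation. The toy reports, the two outcome labels, the whitespace tokenizer, the hyperparameters, and the helper names (`tfidf_vectors`, `train_linear_svm`, `top_terms`) are all assumptions made for illustration. The sketch ranks terms by their learned linear SVM weight, which is one common way to surface predictive text patterns and may differ from the procedure used in the paper.

```python
import math
from collections import Counter

# Toy reports, invented for illustration only (not from the paper's data).
# Label +1 = hypothetical "fall" outcome, -1 = hypothetical "cut" outcome.
REPORTS = [
    ("worker slipped on wet scaffold and fell", 1),
    ("employee fell from ladder while carrying pipe", 1),
    ("worker fell off scaffold after ladder slipped", 1),
    ("hand cut by grinder while cutting pipe", -1),
    ("employee cut finger on sharp sheet metal", -1),
    ("grinder kicked back and cut worker arm", -1),
]


def tfidf_vectors(docs):
    """Return (vocab, vectors): one L2-normalized sparse TF-IDF dict per document."""
    tokenized = [d.split() for d in docs]  # naive whitespace tokenizer (assumption)
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n_docs / doc_freq[t]) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {t: tf[t] * idf[t] for t in tf}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({t: v / norm for t, v in vec.items()})
    return vocab, vectors


def train_linear_svm(vectors, labels, vocab, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge loss."""
    w = {t: 0.0 for t in vocab}
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * (sum(w[t] * v for t, v in x.items()) + b)
            for t in w:  # shrinkage step from the L2 penalty
                w[t] *= 1.0 - lr * lam
            if margin < 1.0:  # hinge loss is active: move toward this example
                for t, v in x.items():
                    w[t] += lr * y * v
                b += lr * y
    return w, b


def top_terms(w, k=3):
    """Return the k most positive and the k most negative terms by weight."""
    ranked = sorted(w.items(), key=lambda kv: kv[1])
    positive = [t for t, _ in reversed(ranked[-k:])]
    negative = [t for t, _ in ranked[:k]]
    return positive, negative


texts = [text for text, _ in REPORTS]
labels = [label for _, label in REPORTS]
vocab, X = tfidf_vectors(texts)
w, b = train_linear_svm(X, labels, vocab)
fall_terms, cut_terms = top_terms(w)
print("terms pushing toward 'fall':", fall_terms)
print("terms pushing toward 'cut':", cut_terms)
```

Large positive weights mark terms that push a report toward the first outcome, and large negative weights mark terms that push it toward the second. On a real corpus, the highly weighted terms would still need review by a safety expert to decide which of them are valid precursors.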

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.11769/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/1907.11769/full.md

## References

62 references — full list in the complete paper: https://tomesphere.com/paper/1907.11769/full.md

---
Source: https://tomesphere.com/paper/1907.11769