Predicting Themes within Complex Unstructured Texts: A Case Study on Safeguarding Reports
Aleksandra Edwards, David Rogers, Jose Camacho-Collados, Hélène de Ribaupierre, Alun Preece

TL;DR
This paper explores how deep learning can effectively classify complex, domain-specific safeguarding reports with limited labeled data, demonstrating potential for automated theme detection in sensitive contexts.
Contribution
It investigates the application of deep learning to small, specialized datasets, providing insights into effective combinations of methods for such challenging scenarios.
Findings
Deep learning models can simulate expert judgment with limited data.
Supervised classification effectively identifies main themes in safeguarding reports.
The approach shows potential for automating complex, domain-specific text analysis.
Abstract
The task of text and sentence classification is associated with the need for large amounts of labelled training data. The acquisition of high volumes of labelled datasets can be expensive or unfeasible, especially for highly-specialised domains for which documents are hard to obtain. Research on the application of supervised classification based on small amounts of training data is limited. In this paper, we address the combination of state-of-the-art deep learning and classification methods and provide an insight into which combination of methods fits the needs of small, domain-specific, and terminologically-rich corpora. We focus on a real-world scenario related to a collection of safeguarding reports comprising learning experiences and reflections on tackling serious incidents involving children and vulnerable adults. The relatively small volume of available reports and their use of…
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
