Text-based classification of interviews for mental health -- juxtaposing the state of the art
Joppe Valentijn Wouts

TL;DR
This paper develops a Dutch language model, belabBERT, to improve text-based classification of psychiatric interviews, demonstrating its competitiveness against audio-based methods and exploring hybrid approaches.
Contribution
Introduction of belabBERT, a Dutch language model trained on a large corpus, and evaluation of text-based classification's potential in psychiatric diagnosis.
Findings
belabBERT improves Dutch NLP tasks
Text-based classification competes with audio-based methods
Hybrid models show promise for future research
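To make the idea of a hybrid model concrete, here is a minimal sketch of one common way to combine a text classifier with an audio classifier: late fusion by weighted averaging of their class probabilities. This is an illustrative assumption, not the architecture used in the thesis. The function names, the `text_weight` parameter, and the placeholder class labels are all hypothetical.

```python
from typing import List, Sequence


def late_fusion(text_probs: Sequence[float],
                audio_probs: Sequence[float],
                text_weight: float = 0.5) -> List[float]:
    """Blend two per-class probability vectors with a weighted average.

    Hypothetical helper: text_weight is the share given to the text model,
    and the remainder (1 - text_weight) goes to the audio model.
    """
    if len(text_probs) != len(audio_probs):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")

    fused = [text_weight * t + (1.0 - text_weight) * a
             for t, a in zip(text_probs, audio_probs)]

    # Renormalise so the result sums to 1 even if the inputs were slightly off.
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("probabilities must have a positive sum")
    return [p / total for p in fused]


def predict(labels: Sequence[str], probs: Sequence[float]) -> str:
    """Return the label with the highest fused probability."""
    best_index = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best_index]


# Placeholder labels and scores, invented for illustration only.
labels = ["class_a", "class_b"]
fused = late_fusion([0.8, 0.2], [0.4, 0.6], text_weight=0.5)
print(predict(labels, fused))
```

Averaging probabilities keeps the two models independent, so either one can be retrained or replaced without touching the other. A learned fusion layer would be the natural next step if enough paired data were available.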
Abstract
Currently, the state of the art for classification of psychiatric illness relies on audio-based classification. This thesis aims to design and evaluate a state-of-the-art text classification network for this challenge. The hypothesis is that a well-designed text-based approach can compete strongly with the state-of-the-art audio-based approaches. Dutch natural language models are limited by the scarcity of pre-trained monolingual NLP models; as a result, they capture long-range semantic dependencies across sentences poorly. To address this, the thesis presents belabBERT, a new Dutch language model extending the RoBERTa[15] architecture. belabBERT is trained on a large Dutch corpus (+32GB) of web-crawled texts. After this thesis evaluates the strength of text-based classification, a brief exploration is done, extending the framework to a…
Taxonomy
Topics: Mental Health via Writing · Topic Modeling · Sentiment Analysis and Opinion Mining
