A comprehensive empirical analysis on cross-domain semantic enrichment for detection of depressive language
Nawshad Farruque, Randy Goebel, Osmar Zaiane

TL;DR
This paper presents a method for enhancing word embeddings with domain-specific information to improve depression detection in Tweets, demonstrating significant performance gains over traditional models.
Contribution
It introduces a novel augmentation approach combining general and domain-specific embeddings using various mapping techniques for depression detection.
Findings
Augmented embeddings outperform baseline models in F1 score.
Auto-encoder based mapping improves semantic relevance.
Data ablation confirms effectiveness of augmentation methods.
Abstract
We analyze the process of creating word embedding feature representations designed for a learning task when annotated data is scarce, for example, in depressive language detection from Tweets. We start with a rich word embedding pre-trained from a large general dataset, which is then augmented with embeddings learned from a much smaller and more specific domain dataset through a simple non-linear mapping mechanism. We also experimented with several other, more sophisticated mapping methods, including several auto-encoder based and custom loss-function based methods that learn embedding representations by gradually learning to be close to words of similar semantics and distant from words of dissimilar semantics. Our strengthened representations better capture the semantics of the depression domain, as they combine the semantics learned from the specific domain coupled with word…
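To make the idea of a non-linear mapping between a general and a domain-specific embedding space concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The toy vocabulary, the vector sizes, the one-hidden-layer tanh network, and the helper names `fit_nonlinear_map` and `augment` are all assumptions made for illustration, and the rule used to combine the two embedding sets is one hypothetical choice among many.

```python
import numpy as np


def fit_nonlinear_map(X, Y, hidden=16, lr=0.1, epochs=300, seed=0):
    """Fit a one-hidden-layer tanh network f so that f(X) approximates Y.

    Trained with full-batch gradient descent on mean squared error.
    Returns the fitted mapping function and the per-epoch loss history.
    """
    rng = np.random.default_rng(seed)
    d_in, d_out = X.shape[1], Y.shape[1]
    W1 = rng.normal(0.0, 0.1, (d_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d_out))
    b2 = np.zeros(d_out)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # hidden activations
        P = H @ W2 + b2                   # predicted domain vectors
        err = P - Y
        losses.append(float(np.mean(err ** 2)))
        gP = 2.0 * err / err.size         # dLoss/dP for the MSE above
        gW2 = H.T @ gP
        gb2 = gP.sum(axis=0)
        gZ = (gP @ W2.T) * (1.0 - H ** 2) # backprop through tanh
        gW1 = X.T @ gZ
        gb1 = gZ.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2

    def mapper(x):
        return np.tanh(x @ W1 + b1) @ W2 + b2

    return mapper, losses


def augment(general, domain, mapper):
    """Hypothetical combination rule (an assumption, not from the paper):
    keep the domain vector when a word has one, otherwise project the
    general vector into the domain space with the learned mapping."""
    out = {w: (domain[w] if w in domain else mapper(v)) for w, v in general.items()}
    for w, v in domain.items():
        out.setdefault(w, v)
    return out


# Toy stand-ins for a large general embedding and a small domain embedding.
rng = np.random.default_rng(42)
general = {w: rng.normal(size=8) for w in ["sad", "tired", "happy", "sleep", "alone", "music"]}
domain = {w: rng.normal(size=4) for w in ["sad", "tired", "sleep", "alone"]}

# The mapping is learned only on words present in both vocabularies.
shared = sorted(set(general) & set(domain))
X = np.stack([general[w] for w in shared])
Y = np.stack([domain[w] for w in shared])

mapper, losses = fit_nonlinear_map(X, Y)
augmented = augment(general, domain, mapper)
```

In this sketch every word ends up with a vector in the domain space: words seen in the domain data keep their domain vector, and words known only to the general embedding receive a projected one. Real embeddings would be loaded from pre-trained files rather than sampled at random.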
Taxonomy
Topics: Mental Health via Writing · Sentiment Analysis and Opinion Mining · Topic Modeling
