Early Linguistic Pattern of Anxiety from Social Media Using Interpretable Linguistic Features: A Multi-Faceted Validation Study with Author-Disjoint Evaluation
Arnab Das Utsa

TL;DR
This study develops an interpretable, linguistically grounded model for detecting anxiety from social media posts, validated across multiple datasets and demonstrating robustness and early detection capabilities.
Contribution
It introduces a transparent, linguistically interpretable approach for social media-based anxiety detection with rigorous validation and robustness testing.
Findings
High accuracy maintained after sentiment removal
Early detection significantly outperforms random chance
Model generalizes well across domains and aligns with clinical data
Abstract
Anxiety affects hundreds of millions of individuals globally, yet large-scale screening remains limited. Social media language provides an opportunity for scalable detection, but current models often lack interpretability, keyword-robustness validation, and rigorous user-level data integrity. This work presents a transparent approach to social media-based anxiety detection through linguistically interpretable feature-grounded modeling and cross-domain validation. Using a substantial dataset of Reddit posts, we trained a logistic regression classifier on posts from carefully curated subreddits, divided into training, validation, and test splits. Comprehensive evaluation included feature ablation, keyword masking experiments, and varying-density difference analyses comparing anxious and control groups, along with external validation using clinically interviewed participants with diagnosed anxiety disorders.…
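
Two of the evaluation ideas named in the abstract and title, keyword masking and author-disjoint splits, can be illustrated with a minimal sketch. This is not the paper's code: the keyword list, the `[MASK]` token, the 80/10/10 ratios, the `author` and `text` field names, and the hash-based assignment are all assumptions chosen for illustration.

```python
import hashlib
import re

# Hypothetical keyword list; the terms actually masked in the paper are not given here.
ANXIETY_KEYWORDS = ["anxiety", "anxious", "panic", "worried"]
_KEYWORD_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, ANXIETY_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)


def mask_keywords(text, token="[MASK]"):
    """Replace explicit anxiety keywords so a model cannot rely on them alone."""
    return _KEYWORD_RE.sub(token, text)


def assign_split(author, train=0.8, val=0.1):
    """Map an author ID to one split, deterministically, via a hash bucket in [0, 1)."""
    digest = hashlib.sha256(author.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    if bucket < train:
        return "train"
    if bucket < train + val:
        return "val"
    return "test"


def author_disjoint_split(posts):
    """Group posts so that every author's posts land in exactly one split."""
    splits = {"train": [], "val": [], "test": []}
    for post in posts:
        splits[assign_split(post["author"])].append(post)
    return splits
```

Because the split is decided per author rather than per post, no user's writing can appear on both sides of the train/test boundary, which is the property an author-disjoint evaluation is meant to guarantee. Hashing is one simple way to get that; a grouped splitter from a machine learning library would serve the same purpose.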
Taxonomy
Topics: Mental Health via Writing · Sentiment Analysis and Opinion Mining · Digital Mental Health Interventions
