Mitigating Data Scarcity in Psychological Defense Classification with Context-Aware Synthetic Augmentation

Hoang-Thuy-Duong Vu; Quoc-Cuong Pham; Huy-Hieu Pham

arXiv:2605.14380 · cs.CL · May 15, 2026

TL;DR

This paper introduces a context-aware synthetic data augmentation method combined with a hybrid classifier to improve psychological defense mechanism classification from text, especially under data scarcity.

Contribution

It presents a novel augmentation framework and hybrid model that incorporate clinical features and context-aware prompts, advancing low-resource psychological text classification.

Findings

01. Achieved 58.26% accuracy, a 40.25% improvement over baseline.

02. Reached a macro-F1 score of 24.62%, a 15.99% increase.

03. Demonstrated the importance of prompt quality in generation fidelity.
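
The gap between the reported accuracy and macro-F1 is consistent with the class imbalance the abstract describes: accuracy rewards getting the frequent classes right, while macro-F1 averages per-class F1 with equal weight, so rare classes that are never predicted pull it down. A minimal sketch of both metrics follows; the labels and predictions are toy values invented for illustration, not data from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in gold or predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, hypothetical labels: one dominant class and two rare ones.
y_true = ["A"] * 8 + ["B", "C"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["A"] * 10

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} macro_f1={mf1:.4f}")
```

In this toy case the majority-class predictor scores well on accuracy but poorly on macro-F1, which is why the two numbers are best read together when classes are imbalanced.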

Abstract

Psychological defense mechanisms (PDMs) are unconscious cognitive processes that modulate how individuals perceive and respond to emotional distress. Automatically classifying PDMs from text is clinically valuable but severely hindered by data scarcity and class imbalance, challenges which generative augmentation alone cannot resolve without psychological grounding. In this work, we address these challenges in the PsyDefDetect shared task (BioNLP@ACL 2026) by proposing a context-aware synthetic augmentation framework combined with a hybrid classification model. Our hybrid model integrates contextual language representations with basic clinical features, along with 150 annotated defense items. Experiments demonstrate that definition quality in prompting directly governs generation fidelity and downstream performance. Our method surpasses DMRS Co-Pilot, reaching an accuracy of 58.26%…
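
To make the idea of definition-grounded, context-aware augmentation concrete, the sketch below computes how many synthetic examples each class would need to match the largest class and assembles a generation prompt that carries a class definition plus the preceding dialogue turns. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the prompt wording, the balancing rule, and the example definition are all hypothetical, and no specific generator model is assumed.

```python
from collections import Counter


def augmentation_quota(labels, target=None):
    """Number of synthetic examples per class needed to reach `target`.

    Assumption: balance every class up to the size of the largest class
    unless an explicit target is given.
    """
    counts = Counter(labels)
    if target is None:
        target = max(counts.values())
    return {label: max(0, target - n) for label, n in counts.items()}


def build_prompt(label, definition, context_turns, n_examples):
    """Assemble a hypothetical generation prompt grounded in a definition and context."""
    context = "\n".join(f"- {turn}" for turn in context_turns)
    return (
        f"Defense mechanism: {label}\n"
        f"Definition: {definition}\n"
        f"Dialogue context:\n{context}\n"
        f"Write {n_examples} new utterance(s) that fit this context "
        f"and express the defense mechanism defined above."
    )


# Toy label distribution; class names are illustrative only.
train_labels = ["denial"] * 5 + ["humor"] * 2
quota = augmentation_quota(train_labels)

# Illustrative definition written for this sketch, not quoted from the paper.
prompt = build_prompt(
    label="humor",
    definition="Easing distress by emphasizing amusing or ironic aspects of a situation.",
    context_turns=["I failed the exam again.", "How are you feeling about it?"],
    n_examples=quota["humor"],
)
print(quota)
print(prompt)
```

Keeping the definition as an explicit argument makes it easy to swap in a more or less detailed definition and compare the generated text, which is the kind of comparison the abstract's claim about definition quality points to.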

Code & Models

Repositories

htdgv/CASA-PDC (GitHub)
