Learning Robust Representations for Malicious Content Detection via Contrastive Sampling and Uncertainty Estimation
Elias Hossain, Umesh Biswas, Charan Gudla, and Sai Phani Parsa

TL;DR
This paper introduces the Uncertainty Contrastive Framework (UCF), a Positive-Unlabeled (PU) learning approach that improves malicious content detection by producing robust, discriminative embeddings through uncertainty-aware contrastive learning and adaptive temperature scaling.
Contribution
The paper presents UCF, a framework that combines an uncertainty-aware contrastive loss, adaptive temperature scaling, and a self-attention-guided LSTM encoder to improve PU learning on noisy, imbalanced data.
Findings
Traditional classifiers trained on UCF embeddings achieve more than 93.38% accuracy in malicious content classification
These classifiers also reach precision above 0.93 and near-perfect recall
Demonstrates clear separation of positive and unlabeled instances in visual analysis
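As a reminder of how the reported metrics relate to each other, the sketch below computes accuracy, precision, and recall from binary predictions. The labels are made-up toy values, not the paper's data, and `binary_metrics` is a hypothetical helper name. It illustrates why near-perfect recall goes together with very few false negatives.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, and recall for 0/1 labels (1 = malicious)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_negatives": fn,
    }


# Toy example (assumed values): every malicious item is caught, one benign item is flagged.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Recall is TP / (TP + FN), so it reaches 1.0 only when no malicious item is missed, while precision drops with each benign item that is wrongly flagged.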
Abstract
We propose the Uncertainty Contrastive Framework (UCF), a Positive-Unlabeled (PU) representation learning framework that integrates uncertainty-aware contrastive loss, adaptive temperature scaling, and a self-attention-guided LSTM encoder to improve classification under noisy and imbalanced conditions. UCF dynamically adjusts contrastive weighting based on sample confidence, stabilizes training using positive anchors, and adapts temperature parameters to batch-level variability. Applied to malicious content classification, UCF-generated embeddings enable multiple traditional classifiers to achieve more than 93.38% accuracy, precision above 0.93, and near-perfect recall, with minimal false negatives and competitive ROC-AUC scores. Visual analyses confirm clear separation between positive and unlabeled instances, highlighting the framework's ability to produce calibrated, discriminative…
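The abstract does not give the loss in closed form, so the following is only a minimal sketch of what a confidence-weighted contrastive objective with a batch-adaptive temperature could look like. Everything specific here is an assumption rather than the authors' formulation: the InfoNCE-style form, the rule that widens the temperature with the standard deviation of batch similarities, and the names `adaptive_temperature`, `uncertainty_contrastive_loss`, `base_tau`, and `confidence`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def adaptive_temperature(sims, base_tau=0.5, min_tau=0.05):
    """Assumed rule: raise the temperature when batch similarities vary more."""
    mean = sum(sims) / len(sims)
    std = math.sqrt(sum((s - mean) ** 2 for s in sims) / len(sims))
    return max(min_tau, base_tau * (1.0 + std))


def uncertainty_contrastive_loss(anchor, positive, others, confidence, base_tau=0.5):
    """InfoNCE-style loss for one anchor, scaled by a confidence weight in [0, 1]."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, o) for o in others]
    tau = adaptive_temperature(sims, base_tau)
    logits = [s / tau for s in sims]
    # Numerically stable log-sum-exp over the positive and all other samples.
    top = max(logits)
    log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
    # Low-confidence samples contribute less to the total loss.
    return confidence * (log_denom - logits[0])


# Toy embeddings (assumed values, not from the paper).
anchor = [1.0, 0.0]
near = [0.9, 0.1]
far = [-1.0, 0.2]
neutral = [0.0, 1.0]

loss_confident = uncertainty_contrastive_loss(anchor, near, [far, neutral], confidence=1.0)
loss_uncertain = uncertainty_contrastive_loss(anchor, near, [far, neutral], confidence=0.3)
loss_bad_pair = uncertainty_contrastive_loss(anchor, far, [near, neutral], confidence=1.0)
```

In this sketch the positive anchor pulls `near` toward `anchor`, a mismatched positive (`loss_bad_pair`) is penalized more heavily, and a sample with confidence 0.3 contributes 30% of the loss it would contribute at full confidence.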
Taxonomy
Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Authorship Attribution and Profiling
