Learning Robust Representations for Malicious Content Detection via Contrastive Sampling and Uncertainty Estimation

Elias Hossain, Umesh Biswas, Charan Gudla, and Sai Phani Parsa

arXiv:2512.08969·cs.LG·December 11, 2025

TL;DR

This paper introduces the Uncertainty Contrastive Framework (UCF), a Positive-Unlabeled (PU) learning approach that improves malicious content detection by learning robust, discriminative embeddings with an uncertainty-aware contrastive loss and adaptive temperature scaling.

Contribution

The paper presents UCF, a framework that combines an uncertainty-aware contrastive loss, adaptive temperature scaling, and a self-attention-guided LSTM encoder to improve PU learning on noisy, imbalanced data.

Findings

01 Classifiers trained on UCF embeddings achieve more than 93.38% accuracy in malicious content classification

02 The same classifiers reach precision above 0.93 and near-perfect recall

03 Visual analysis shows clear separation between positive and unlabeled instances

Abstract

We propose the Uncertainty Contrastive Framework (UCF), a Positive-Unlabeled (PU) representation learning framework that integrates uncertainty-aware contrastive loss, adaptive temperature scaling, and a self-attention-guided LSTM encoder to improve classification under noisy and imbalanced conditions. UCF dynamically adjusts contrastive weighting based on sample confidence, stabilizes training using positive anchors, and adapts temperature parameters to batch-level variability. Applied to malicious content classification, UCF-generated embeddings enable multiple traditional classifiers to achieve more than 93.38% accuracy, precision above 0.93, and near-perfect recall, with minimal false negatives and competitive ROC-AUC scores. Visual analyses confirm clear separation between positive and unlabeled instances, highlighting the framework's ability to produce calibrated, discriminative…
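The abstract names three mechanisms (confidence-based contrastive weighting, positive anchors, and a batch-adaptive temperature) but this listing does not give their formulas. The following is only a minimal sketch of how such pieces could fit together, not the authors' implementation. It assumes a generic InfoNCE-style loss, a hypothetical per-sample `confidence` score in [0, 1], and a hypothetical temperature rule that scales a base value by the standard deviation of in-batch cosine similarities. All function and parameter names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def adaptive_temperature(sims, base_tau=0.1):
    # ASSUMPTION: temperature grows with the spread of batch similarities.
    mean = sum(sims) / len(sims)
    std = math.sqrt(sum((s - mean) ** 2 for s in sims) / len(sims))
    return base_tau * (1.0 + std)

def uncertainty_contrastive_loss(embeddings, is_positive, confidence, base_tau=0.1):
    """Confidence-weighted InfoNCE-style loss with labeled positives as anchors.

    Returns (loss, tau). ASSUMPTION: each anchor/positive pair is weighted by
    the product of the two samples' confidence scores.
    """
    n = len(embeddings)
    sim = [[cosine(embeddings[i], embeddings[j]) for j in range(n)] for i in range(n)]
    off_diag = [sim[i][j] for i in range(n) for j in range(n) if i != j]
    tau = adaptive_temperature(off_diag, base_tau)

    total, weight_sum = 0.0, 0.0
    for i in range(n):
        if not is_positive[i]:
            continue  # only labeled positives serve as anchors
        # Denominator contrasts the anchor against every other sample in the batch.
        denom = sum(math.exp(sim[i][k] / tau) for k in range(n) if k != i)
        for p in range(n):
            if p == i or not is_positive[p]:
                continue
            w = confidence[i] * confidence[p]
            total += -w * math.log(math.exp(sim[i][p] / tau) / denom)
            weight_sum += w
    loss = total / weight_sum if weight_sum > 0 else 0.0
    return loss, tau
```

Under these assumptions, a batch whose labeled positives sit close together in embedding space yields a smaller loss than one whose positives are scattered among unlabeled samples, and low-confidence pairs contribute proportionally less to the average.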

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Authorship Attribution and Profiling