SIExVulTS: Sensitive Information Exposure Vulnerability Detection System using Transformer Models and Static Analysis
Kyler Katz, Sara Moshtari, Ibrahim Mujhid, Mehdi Mirakhorli, Derek Garcia

TL;DR
SIExVulTS is a novel system combining transformer models and static analysis to detect and verify sensitive information exposure vulnerabilities in Java applications, outperforming existing tools and discovering new CVEs.
Contribution
This paper introduces SIExVulTS, integrating transformer-based models with static analysis for comprehensive CWE-200 vulnerability detection and verification in Java code.
Findings
Attack Surface Detection achieved an F1 score above 93%
Flow Verification increased precision from 22.61% to 87.23%
Successfully uncovered six new CVEs in Apache projects
Abstract
Sensitive Information Exposure (SIEx) vulnerabilities (CWE-200) remain a persistent and under-addressed threat across software systems, often leading to serious security breaches. Existing detection tools rarely target the diverse subcategories of CWE-200 or provide context-aware analysis of code-level data flows.

Aims: This paper aims to present SIExVulTS, a novel vulnerability detection system that integrates transformer-based models with static analysis to identify and verify sensitive information exposure in Java applications.

Method: SIExVulTS employs a three-stage architecture: (1) an Attack Surface Detection Engine that uses sentence embeddings to identify sensitive variables, strings, comments, and sinks; (2) an Exposure Analysis Engine that instantiates CodeQL queries aligned with the CWE-200 hierarchy; and (3) a Flow Verification Engine that leverages GraphCodeBERT to…
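To make the terms "sensitive variable" and "sink" concrete, here is a minimal Java sketch of the kind of source-to-sink flow that a CWE-200 detector looks for. It is an illustration only and is not taken from the paper or from any of the Apache projects it mentions: the class name `ExposureExample`, the in-memory `SINK` list (a stand-in for a log file or HTTP response), and both login methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ExposureExample {
    // Hypothetical sink: stands in for a log file, console, or HTTP response.
    static final List<String> SINK = new ArrayList<>();

    static void log(String message) {
        SINK.add(message);
    }

    // Exposing flow: the sensitive variable `password` reaches the sink.
    static void loginUnsafe(String user, String password) {
        log("Login failed for " + user + " with password " + password);
    }

    // Non-exposing flow: `password` is accepted but never written to the sink.
    static void loginSafe(String user, String password) {
        log("Login failed for " + user);
    }

    public static void main(String[] args) {
        loginUnsafe("alice", "s3cret");
        loginSafe("alice", "s3cret");
        for (String line : SINK) {
            System.out.println(line);
        }
    }
}
```

In this sketch, `password` plays the role of a sensitive variable and `log` the role of a sink. A detector that only matches names would flag both methods because both have a `password` parameter; telling apart the first method (a real exposure) from the second (no flow to the sink) is the kind of distinction that data-flow analysis and a verification step are meant to make.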
