Secret Leak Detection in Software Issue Reports using LLMs: A Comprehensive Evaluation

Sadif Ahmed; Md Nafiu Rahman; Zahin Wahab; Gias Uddin; Rifat Shahriyar

arXiv:2410.23657·cs.SE·April 17, 2026

TL;DR

This paper presents a comprehensive evaluation of secret leak detection in GitHub issue reports using LLMs, introducing a new benchmark dataset and a hybrid detection pipeline that outperforms prior methods.

Contribution

It introduces a large-scale benchmark dataset and a hybrid detection pipeline combining regex and LLMs for effective secret leak detection in issue reports.

Findings

01. Regex and entropy methods have high recall but low precision.

02. Open-source LLMs like Qwen and LLaMA achieve up to 94.49% F1 score.

03. The approach generalizes well to real-world GitHub repositories.
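To illustrate why an entropy-only check tends toward high recall but low precision, here is a minimal sketch of a Shannon-entropy filter. It is not the baseline implementation evaluated in the paper; the function names, the 3.5-bit threshold, and the 16-character minimum length are assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_secret(token: str, threshold: float = 3.5, min_len: int = 16) -> bool:
    """Flag a token as a possible secret using entropy alone.

    threshold and min_len are illustrative assumptions, not values from the paper.
    """
    return len(token) >= min_len and shannon_entropy(token) >= threshold


# A hex-like string that is not a secret (for example, a hash or an ID)
# still passes the entropy check, which is a false positive.
non_secret_hex = "0123456789abcdef" * 2
repeated = "a" * 20

print(looks_like_secret(non_secret_hex))
print(looks_like_secret(repeated))
```

Because any sufficiently random-looking string clears the threshold, a filter like this catches most real secrets but also flags hashes, UUIDs, and similar identifiers, which is the precision problem the finding describes.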

Abstract

In the digital era, accidental exposure of sensitive information such as API keys, tokens, and credentials is a growing security threat. While most prior work focuses on detecting secrets in source code, leakage in software issue reports remains largely unexplored. This study fills that gap through a large-scale analysis and a practical detection pipeline for exposed secrets in GitHub issues. Our pipeline combines regular expression-based extraction with large language model (LLM)-based contextual classification to detect real secrets and reduce false positives. We build a benchmark of 54,148 instances from public GitHub issues, including 5,881 manually verified true secrets. Using this dataset, we evaluate entropy-based baselines and keyword heuristics used by prior secret detection tools, classical machine learning, deep learning, and LLM-based methods. Regex and entropy based…
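The two-stage design described in the abstract (regex-based candidate extraction followed by contextual classification) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the two patterns, the context window size, and `placeholder_classifier` are hypothetical, and the classifier is a keyword stub standing in for the LLM call.

```python
import re
from typing import Callable, List, NamedTuple


class Candidate(NamedTuple):
    value: str    # the matched string
    context: str  # surrounding issue text passed to the classifier


# Hypothetical patterns for illustration only; a real tool would use many more.
PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),
    re.compile(r"AKIA[0-9A-Z]{16}"),
]


def extract_candidates(issue_text: str, window: int = 60) -> List[Candidate]:
    """Stage 1: regex extraction of candidate strings with surrounding context."""
    candidates = []
    for pattern in PATTERNS:
        for m in pattern.finditer(issue_text):
            start = max(0, m.start() - window)
            end = min(len(issue_text), m.end() + window)
            candidates.append(Candidate(m.group(0), issue_text[start:end]))
    return candidates


def placeholder_classifier(candidate: Candidate) -> bool:
    """Stage 2 stand-in: a keyword stub where an LLM would judge the context.

    Returns True when the candidate is treated as a real secret.
    """
    hints = ("example", "dummy", "placeholder", "fake")
    return not any(h in candidate.context.lower() for h in hints)


def detect_secrets(
    issue_text: str,
    classify: Callable[[Candidate], bool] = placeholder_classifier,
) -> List[str]:
    """Run extraction, then keep only candidates the classifier accepts."""
    return [c.value for c in extract_candidates(issue_text) if classify(c)]


# Synthetic strings built at runtime; they are not real credentials.
real_looking = "My token is ghp_" + "a" * 36 + " and the request fails"
documented = "Use a dummy key like AKIA" + "A" * 16 + " in the config"

print(detect_secrets(real_looking))
print(detect_secrets(documented))
```

Passing the classifier as a parameter keeps the extraction stage independent of the model, so the stub could be swapped for a call to whichever LLM is being evaluated.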
