LLM-Driven Adaptive Source-Sink Identification and False Positive Mitigation for Static Analysis
Shiyin Lin

TL;DR
AdaTaint is an LLM-driven static analysis framework that adaptively infers source-sink specifications and filters false positives through neuro-symbolic reasoning, significantly improving accuracy and reducing false alarms.
Contribution
It introduces a novel neuro-symbolic approach combining LLM inference with program facts for adaptive source-sink identification and false positive mitigation in static analysis.
Findings
Reduces false positives by 43.7% on average
Improves recall by 11.2% over state-of-the-art methods
Maintains competitive runtime overhead
Abstract
Static analysis is effective for discovering software vulnerabilities but notoriously suffers from incomplete source--sink specifications and excessive false positives (FPs). We present \textsc{AdaTaint}, an LLM-driven taint analysis framework that adaptively infers source/sink specifications and filters spurious alerts through neuro-symbolic reasoning. Unlike LLM-only detectors, \textsc{AdaTaint} grounds model suggestions in program facts and constraint validation, ensuring both adaptability and determinism. We evaluate \textsc{AdaTaint} on Juliet 1.3, SV-COMP-style C benchmarks, and three large real-world projects. Results show that \textsc{AdaTaint} reduces false positives by \textbf{43.7\%} on average and improves recall by \textbf{11.2\%} compared to state-of-the-art baselines (CodeQL, Joern, and LLM-only pipelines), while maintaining competitive runtime overhead. These findings…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSoftware Engineering Research · Software Testing and Debugging Techniques · Security and Verification in Computing
