YASA: Scalable Multi-Language Taint Analysis on the Unified AST at Ant Group

Yayi Wang; Shenao Wang; Jian Zhao; Shaosen Shi; Ting Li; Yan Cheng; Lizhong Bian; Kan Yu; Yanjie Zhao; Haoyu Wang

arXiv:2601.17390 · cs.SE · April 3, 2026

TL;DR

YASA is a scalable multi-language static taint analysis framework using a unified AST, enabling effective security testing across diverse programming languages in large industrial applications.

Contribution

YASA introduces the Unified Abstract Syntax Tree (UAST), a single representation shared across languages for taint analysis, with the goal of improving scalability, analysis precision, and extensibility in industrial-scale security testing.
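To make the idea concrete, the following is a minimal Python sketch of taint propagation over a language-neutral node list. It is not YASA's UAST or its analysis: the node types (`Assign`, `Call`), the function `find_taint_paths`, and the source and sink names are all hypothetical, chosen only to show why one shared representation lets a single analysis pass serve programs that originated in different languages.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Assign:
    """Hypothetical unified node: target is computed from the listed variables."""
    target: str
    sources: List[str]


@dataclass
class Call:
    """Hypothetical unified node: a call whose result may be bound to `target`."""
    callee: str
    args: List[str]
    target: str = ""


def find_taint_paths(nodes, sources: Set[str], sinks: Set[str]) -> List[Tuple[str, str]]:
    """Forward taint propagation over a straight-line list of unified nodes.

    Returns (sink_callee, tainted_argument) pairs. Assumes no branches,
    loops, or function bodies; a real analyzer handles all of these.
    """
    tainted: Set[str] = set()
    findings: List[Tuple[str, str]] = []
    for node in nodes:
        if isinstance(node, Call):
            # Report tainted arguments that reach a sink.
            if node.callee in sinks:
                findings.extend((node.callee, a) for a in node.args if a in tainted)
            # The result is tainted if the callee is a source or any argument is tainted.
            if node.target:
                if node.callee in sources or any(a in tainted for a in node.args):
                    tainted.add(node.target)
                else:
                    tainted.discard(node.target)
        elif isinstance(node, Assign):
            # Assignment propagates taint; overwriting with clean data clears it.
            if any(s in tainted for s in node.sources):
                tainted.add(node.target)
            else:
                tainted.discard(node.target)
    return findings


# Assumed source/sink names, for illustration only.
SOURCES = {"input", "FormValue"}
SINKS = {"execute", "Exec"}

# Two programs as they might look after hypothetical per-language frontends
# lowered them to the same node types.
python_like = [
    Call("input", [], target="user"),
    Assign("query", ["prefix", "user"]),
    Call("execute", ["query"]),
]
go_like = [
    Call("FormValue", ["key"], target="name"),
    Assign("cmd", ["name"]),
    Call("Exec", ["cmd"]),
]
# The tainted variable is overwritten with clean data before reaching the sink.
overwritten = [
    Call("input", [], target="user"),
    Assign("user", ["default_value"]),
    Call("execute", ["user"]),
]

python_findings = find_taint_paths(python_like, SOURCES, SINKS)
go_findings = find_taint_paths(go_like, SOURCES, SINKS)
overwritten_findings = find_taint_paths(overwritten, SOURCES, SINKS)
```

In this sketch, supporting another language means writing a new frontend that emits `Assign` and `Call` nodes, while `find_taint_paths` stays unchanged. That is the general appeal of a unified representation, though the actual UAST design and analysis in the paper are necessarily far richer.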

Findings

01. YASA outperforms existing analyzers on industry-standard benchmarks.

02. In real-world deployment, YASA has analyzed over 100 million lines of code.

03. YASA identified 314 previously unknown taint paths, 92 of which were confirmed as 0-day vulnerabilities.

Abstract

Modern enterprises increasingly adopt diverse technology stacks with various programming languages, posing significant challenges for static application security testing (SAST). Existing taint analysis tools are predominantly designed for single languages, requiring substantial engineering effort that scales with language diversity. While multi-language tools like CodeQL, Joern, and WALA attempt to address these challenges, they face limitations in intermediate representation design, analysis precision, and extensibility, which make them difficult to scale effectively for large-scale industrial applications at Ant Group. To bridge this gap, we present YASA (Yet Another Static Analyzer), a unified multi-language static taint analysis framework designed for industrial-scale deployment. Specifically, YASA introduces the Unified Abstract Syntax Tree (UAST) that provides a unified…
