Detecting Call Graph Unsoundness without Ground Truth
Fangtian Zhong, Ollie Wold, Joseph Windmann

TL;DR
This paper empirically investigates semantic inconsistencies and unsoundness in Java static analysis frameworks. It shows that more precise analyses can introduce new call-graph edges and that different frameworks have incompatible semantics.
Contribution
It provides a large-scale empirical study showing that modern language features and configuration interactions cause analysis inconsistencies and that frameworks have fundamentally incompatible semantics.
Findings
Algorithmic precision orders often break within frameworks because of modern language features such as lambdas, reflection, and native modeling.
Configuration choices interact with algorithms, causing synergistic failures.
Different frameworks operate over incompatible notions of call-graph ground truth.
Abstract
Java static analysis frameworks are commonly compared under the assumption that analysis algorithms and configurations compose monotonically and yield semantically comparable results across tools. In this work, we show that this assumption is fundamentally flawed. We present a large-scale empirical study of semantic consistency within and across four widely used Java static analysis frameworks: Soot, SootUp, WALA, and Doop. Using precision partial orders over analysis algorithms and configurations, we systematically identify violations where increased precision introduces new call-graph edges or amplifies inconsistencies. Our results reveal three key findings. First, algorithmic precision orders frequently break within frameworks due to modern language features such as lambdas, reflection, and native modeling. Second, configuration choices strongly interact with analysis algorithms,…
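The precision-order check described in the abstract can be illustrated with a minimal sketch. It assumes a call graph is represented as a set of (caller, callee) edges and that a more precise algorithm should report a subset of the edges reported by a coarser one. The function name `precision_violations`, the edge representation, and the sample data are hypothetical and are not taken from the paper or from any framework's API.

```python
from typing import Dict, List, Set, Tuple

# An edge is a (caller, callee) pair of method names. This representation
# is an assumption made for illustration only.
Edge = Tuple[str, str]


def precision_violations(
    call_graphs: Dict[str, Set[Edge]],
    more_precise_than: List[Tuple[str, str]],
) -> Dict[Tuple[str, str], Set[Edge]]:
    """Return, for each (precise, coarse) pair in the partial order, the
    edges reported by the precise analysis but absent from the coarse one.

    Under the subset expectation this set should be empty, so a non-empty
    result flags an inconsistency without needing ground truth.
    """
    violations: Dict[Tuple[str, str], Set[Edge]] = {}
    for precise, coarse in more_precise_than:
        extra = call_graphs[precise] - call_graphs[coarse]
        if extra:
            violations[(precise, coarse)] = extra
    return violations


# Made-up example data: RTA is conventionally more precise than CHA,
# yet here it reports an edge that CHA does not.
graphs = {
    "CHA": {("Main.main", "A.run"), ("Main.main", "B.run")},
    "RTA": {("Main.main", "A.run"), ("Main.main", "Main.lambda$0")},
}
order = [("RTA", "CHA")]

violations = precision_violations(graphs, order)
for (precise, coarse), edges in violations.items():
    print(f"{precise} adds edges missing from {coarse}: {sorted(edges)}")
```

A flagged edge does not say which analysis is wrong: either the coarser analysis missed a real call or the more precise one added a spurious edge. The check only shows that the two results cannot both respect the assumed order.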
