Superset Decompilation
Chang Liu, Yihao Sun, Thomas Gilray, Kristopher Micinski

TL;DR
This paper introduces provenance-guided superset decompilation (PGSD), a modular decompilation framework that improves reverse engineering by retaining ambiguous interpretations and formalizing the process as a sequence of interpretive passes.
Contribution
The paper formalizes superset decompilation as a provenance-guided, modular pipeline, which enables more flexible and accurate reverse engineering than monolithic tools.
Findings
Manifold, the implementation of PGSD, matches Ghidra, IDA Pro, angr, and RetDec in output quality.
Manifold's output produces fewer compiler errors than these tools.
The framework generalizes across different compilers and optimization levels.
Abstract
Reverse engineering tools remain monolithic and imperative compared to the advancement of modern compiler architectures: analyses are tied to a single mutable representation, making them difficult to extend or refine, and forcing premature choices between soundness and precision. We observe that decompilation is the reverse of compilation and can be structured as a sequence of modular passes, each performing a granular and clearly defined interpretation of the binary at a progressively higher level of abstraction. We formalize this as provenance-guided superset decompilation (PGSD), a framework that monotonically derives facts about the binary into a relation store. Instead of committing early to a single interpretation, the pipeline retains ambiguous interpretations as parallel candidates with provenance, deferring resolution until the final selection phase. Manifold implements PGSD as…
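The idea of a monotone relation store with provenance and deferred selection can be illustrated with a small sketch. This is not Manifold's code or API: the names `Fact`, `RelationStore`, `decode_pass`, `type_pass`, and `select`, the toy instruction strings, and the scoring rule are all hypothetical, chosen only to show how passes might add facts without deleting any, keep competing candidates side by side, and leave the choice to a final phase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    relation: str  # relation name, e.g. "insn" or "var_type" (hypothetical)
    args: tuple    # relation arguments; the first one is used as the key


class RelationStore:
    """Append-only store: facts are added, never removed or overwritten."""

    def __init__(self):
        self._prov = {}  # Fact -> set of (pass_name, premises)

    def add(self, fact, pass_name, premises=()):
        """Record a fact and how it was derived; return True if the store grew."""
        entry = (pass_name, tuple(premises))
        justifications = self._prov.setdefault(fact, set())
        if entry in justifications:
            return False
        justifications.add(entry)
        return True

    def query(self, relation):
        facts = (f for f in self._prov if f.relation == relation)
        return sorted(facts, key=lambda f: f.args)

    def provenance(self, fact):
        return self._prov[fact]


def decode_pass(store, raw):
    """Toy pass: keep every candidate decoding of each offset."""
    for offset, candidates in raw.items():
        for text in candidates:
            store.add(Fact("insn", (offset, text)), "decode")


def type_pass(store):
    """Toy pass: derive one candidate type per candidate instruction."""
    for insn in store.query("insn"):
        offset, text = insn.args
        ty = "int*" if text.startswith("lea") else "int"
        store.add(Fact("var_type", (offset, ty)), "type", premises=[insn])


def select(store, relation, score):
    """Deferred resolution: pick the best-scoring candidate per key."""
    best = {}
    for fact in store.query(relation):
        key = fact.args[0]
        if key not in best or score(fact) > score(best[key]):
            best[key] = fact
    return best


# Assumed toy input: offset 0x10 has two competing decodings.
raw = {0x10: ["mov eax, [rbp-8]", "lea rax, [rbp-8]"], 0x14: ["ret"]}
store = RelationStore()
decode_pass(store, raw)
type_pass(store)

# Both candidates for 0x10 coexist until selection; here the (arbitrary)
# score prefers pointer types.
chosen = select(store, "var_type", score=lambda f: f.args[1] == "int*")
print(chosen[0x10].args, store.provenance(chosen[0x10]))
```

Because `add` only ever grows the store, rerunning a pass is harmless, and a later pass can be inserted without rewriting earlier ones. The provenance entries record which pass and which premises produced each candidate, which is the information a final selection phase would need to pick a consistent interpretation.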
