The Strengths and Behavioral Quirks of Java Bytecode Decompilers

Nicolas Harrand; César Soto-Valero; Martin Monperrus; Benoit Baudry

arXiv:1908.06895 · cs.SE · December 19, 2019

TL;DR

This paper evaluates eight Java decompilers on real-world software and finds that no single tool perfectly reconstructs source code: the best one produces syntactically correct output for 84% of classes and semantically equivalent output for 78%.

Contribution

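It provides an empirical analysis of the effectiveness of eight Java decompilers across three quality indicators (syntactic correctness, syntactic distortion, and semantic equivalence modulo inputs), using a benchmark of 14 real-world open-source projects.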

Findings

01 No decompiler handles all bytecode structures correctly.

02 The highest-ranked decompiler produces syntactically correct output for 84% of classes.

03 The same decompiler produces semantically equivalent output for 78% of classes.

Abstract

During compilation from Java source code to bytecode, some information is irreversibly lost. In other words, compilation and decompilation of Java code are not symmetric. Consequently, the decompilation process, which aims to produce source code from bytecode, must adopt strategies to reconstruct the information that has been lost. Modern Java decompilers tend to use distinct strategies to achieve proper decompilation. In this work, we hypothesize that the diverse ways in which bytecode can be decompiled have a direct impact on the quality of the source code produced by decompilers. We study the effectiveness of eight Java decompilers with respect to three quality indicators: syntactic correctness, syntactic distortion, and semantic equivalence modulo inputs. This study relies on a benchmark set of 14 real-world open-source software projects to be decompiled (2041 classes in…

Code & Models

Repositories

castor-software/decompilercmp
Official
