Fine-Grained Static Detection of Obfuscation Transforms Using Ensemble-Learning and Semantic Reasoning

Ramtine Tofighi-Shirazi (TL); Irina Mariuca Asavoae (TL); Philippe Elbaz-Vincent (IF)

arXiv:1911.07523·cs.CL·November 19, 2019

TL;DR

This paper introduces a static detection framework combining semantic reasoning and ensemble learning to identify multiple layers and constructions of obfuscation in software, achieving high accuracy without prior knowledge of functionality.

Contribution

It presents a novel fine-grained, multi-layer obfuscation detection method that does not depend on knowledge of the functionality of the training set, which extends it beyond existing approaches.

Findings

01. Achieves up to 91% accuracy on obfuscators like Tigress and OLLVM.

02. Detects obfuscation constructions with up to 100% accuracy.

03. Provides best practices for scalable machine learning-based detection.

Abstract

The ability to efficiently detect the software protections in use is of prime importance for selecting and applying adequate deobfuscation techniques. We present a novel approach that combines semantic reasoning techniques with ensemble learning classification to provide a static detection framework for obfuscation transformations. In contrast to existing work, we provide a methodology that can detect multiple layers of obfuscation without depending on knowledge of the underlying functionality of the training set used. We also extend our work to detect constructions of obfuscation transformations, thus providing a fine-grained methodology. To that end, we provide several studies on best practices for using machine learning techniques to obtain a scalable and efficient model. According to our experimental results and evaluations on obfuscators such as…
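The ensemble-learning classification mentioned in the abstract can be illustrated with a small, generic sketch. This is not the authors' pipeline: the feature triples, the label names, and the nearest-centroid base learner are all assumptions made for illustration, and the semantic reasoning stage that would produce real features is not modeled. The sketch only shows the general idea of bagging, where several base learners are trained on bootstrap resamples and combined by majority vote.

```python
import random
from collections import Counter


def train_centroid(samples):
    """Base learner: one mean feature vector (centroid) per label."""
    sums, counts = {}, Counter()
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroid(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))


def train_ensemble(samples, n_models=15, seed=0):
    """Bagging: train each base learner on a bootstrap resample of the data."""
    rng = random.Random(seed)
    return [
        train_centroid([rng.choice(samples) for _ in samples])
        for _ in range(n_models)
    ]


def predict_ensemble(models, features):
    """Majority vote across the base learners."""
    votes = Counter(predict_centroid(m, features) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical features per code fragment (assumed, not from the paper):
# (conditional branches, arithmetic/boolean operations, indirect jumps)
TRAINING = [
    ((2, 5, 0), "none"),
    ((3, 4, 0), "none"),
    ((1, 6, 0), "none"),
    ((9, 14, 0), "opaque_predicate"),
    ((8, 16, 1), "opaque_predicate"),
    ((10, 15, 0), "opaque_predicate"),
    ((4, 6, 7), "flattening"),
    ((5, 5, 9), "flattening"),
    ((3, 7, 8), "flattening"),
]

models = train_ensemble(TRAINING)
print(predict_ensemble(models, (9, 15, 0)))
print(predict_ensemble(models, (4, 6, 8)))
```

Detecting several stacked layers, as the abstract describes, would need a multi-label setup (for example one such ensemble per transformation); the single-label vote here is kept deliberately simple.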
