Towards Generic Deobfuscation of Windows API Calls

Vadim Kotov; Michael Wojnowicz

arXiv:1802.04466·cs.CR·December 8, 2020

TL;DR

This paper presents a static analysis method that uses symbolic execution and hidden Markov models to deobfuscate Windows API calls in malware, achieving high prediction accuracy without executing the code.

Contribution

It introduces a novel static analysis technique combining symbolic execution and HMMs for generic deobfuscation of Windows API calls in malware.

Findings

01 87.60% API name prediction accuracy

02 Effective static alternative to dynamic analysis

03 Handles obfuscated API call patterns

Abstract

A common way to get insight into a malicious program's functionality is to look at which API functions it calls. To complicate the reverse engineering of their programs, malware authors deploy API obfuscation techniques that hide these calls from analysts and anti-malware scanners. This problem can be partially addressed by dynamic analysis; that is, by executing a malware sample in a controlled environment and logging the API calls. However, malware that is aware of virtual machines and sandboxes might terminate without showing any signs of malicious behavior. In this paper, we introduce a static analysis technique allowing generic deobfuscation of Windows API calls. The technique uses symbolic execution and hidden Markov models to predict API names from the arguments passed to the API functions. Our best prediction model correctly identifies API names with 87.60% accuracy.
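To make the idea of predicting an API name from its arguments concrete, here is a minimal sketch, not the authors' implementation. It assumes one small discrete HMM per candidate API, scores an argument sequence with the forward algorithm, and picks the API whose model gives the highest likelihood. The token vocabulary ("str", "int", "null"), the two-state models, and all probabilities are hypothetical values chosen for illustration; in the paper the argument sequences would come from symbolic execution and the models would be trained on data.

```python
import math

# Hypothetical smoothing floor for tokens a model has never seen.
UNSEEN = 1e-6


def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of a non-empty token sequence under a discrete HMM.

    Uses the forward algorithm, rescaling at each step to avoid underflow.
    """
    n = len(start)
    alpha = [start[i] * emit[i].get(obs[0], UNSEEN) for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n))
                * emit[j].get(obs[t], UNSEEN)
                for j in range(n)
            ]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


# Toy, hand-written models (assumptions, not trained parameters).
# State 0 roughly models the first argument, state 1 the remaining ones.
MODELS = {
    "CreateFileW": {
        "start": [0.9, 0.1],
        "trans": [[0.1, 0.9], [0.05, 0.95]],
        "emit": [
            {"str": 0.9, "null": 0.05, "int": 0.05},
            {"int": 0.7, "null": 0.3},
        ],
    },
    "VirtualAlloc": {
        "start": [0.9, 0.1],
        "trans": [[0.1, 0.9], [0.05, 0.95]],
        "emit": [
            {"null": 0.8, "int": 0.2},
            {"int": 0.95, "null": 0.05},
        ],
    },
}


def predict_api(arg_tokens):
    """Return the API name whose HMM assigns the highest log-likelihood."""
    return max(
        MODELS,
        key=lambda name: forward_loglik(
            arg_tokens,
            MODELS[name]["start"],
            MODELS[name]["trans"],
            MODELS[name]["emit"],
        ),
    )


if __name__ == "__main__":
    # Seven arguments, starting with a string pointer (CreateFileW-like).
    print(predict_api(["str", "int", "int", "null", "int", "int", "null"]))
    # Four arguments, starting with a null address (VirtualAlloc-like).
    print(predict_api(["null", "int", "int", "int"]))
```

Scoring each sequence against per-API models and taking the maximum is one simple way to turn HMMs into a classifier; the paper's actual feature encoding and model structure may differ.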

Code & Models

Repositories

cylance/winapi-deobfuscation (Official)
