With thousands of malware samples surfacing every day, dynamic monitoring systems play a key role in the automatic characterization and detection of malicious behaviors that undermine the security of computing systems. Modern malware strains, however, adopt reconnaissance techniques on the execution environment, looking for artifacts that are indicative of the presence of a monitoring system and hiding their normally harmful behavior to elude detection and analysis.
In the arms race between malware writers and defenders, researchers and security firms have invested in stealth analysis environments based on virtualization, emulation, or bare-metal execution. Unfortunately, realizing an analysis system that is indistinguishable from a victim machine is in practice impossible due to the discrepancies inevitably introduced by analysis agents. While vendors try to patch imperfections, malware writers regularly find unanticipated ways to expose monitoring systems. The malware analyst is then left with the sole option of manually dissecting an evasive malware sample to understand the employed adversarial technique: a lengthy and complex process.
We propose an innovative methodology for identifying unforeseen evasion strategies and helping analysts disarm them. We build on a simple intuition: while a sample may adopt different strategies for different monitoring systems, it still features identifiable patterns that characterize the points where environmental checks lead to evasion decisions. We plan to use data-flow analysis to understand how already-known checks are carried out; the information gained will then guide tainting and fuzzing techniques for discovering novel fingerprinting checks in the same sample.
Our tools will improve the productivity of analysts when dismantling unforeseen evasions, and will contribute to patching monitoring systems by providing a methodology that pinpoints the underpinnings of the adversarial techniques used to detect them.
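To make the intuition about environmental checks leading to evasion decisions more concrete, the following Python sketch propagates taint through a toy straight-line program and reports the branches whose condition depends on an environment query. It is only an illustration under strong assumptions: the tuple-based instruction format, the register names, and the function names (`query_cpu_vendor`, `get_input_length`) are hypothetical, and a real analysis would operate on machine instructions and memory rather than on this simplified model.

```python
def tainted_branches(program, sources):
    """Return indices of branches whose condition depends on a taint source.

    program: list of (op, dst, operands) tuples (hypothetical toy format).
             For "call", operands[0] is the name of the called function.
             For "branch", dst is None and operands are the condition inputs.
    sources: set of function names assumed to query the environment.
    """
    tainted = set()
    hits = []
    for index, (op, dst, operands) in enumerate(program):
        if op == "branch":
            # A branch on tainted data is a candidate evasion decision.
            if any(name in tainted for name in operands):
                hits.append(index)
            continue
        if op == "call":
            is_tainted = operands[0] in sources
        else:
            # Any other op copies taint from its inputs to its destination.
            is_tainted = any(name in tainted for name in operands)
        if is_tainted:
            tainted.add(dst)
        else:
            tainted.discard(dst)  # overwritten with untainted data
    return hits


# Hypothetical program: one check on the environment, one on ordinary input.
program = [
    ("call", "r0", ("query_cpu_vendor",)),
    ("mov", "r1", ("r0",)),
    ("call", "r2", ("get_input_length",)),
    ("cmp", "flags", ("r1", "const")),
    ("branch", None, ("flags",)),
    ("cmp", "flags", ("r2", "const")),
    ("branch", None, ("flags",)),
]
hits = tainted_branches(program, {"query_cpu_vendor"})
```

In this toy run only the first branch is reported, since the second one depends on a value that does not originate from the assumed environment query.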
[Innovation]
The proposed project could contribute to rapid advancements in the field of malware analysis. In particular, designing emulated and virtualized sandboxes that remain robust against ever-evolving evasion strategies is an open challenge in the security industry. A thorough understanding of the anti-analysis techniques employed by malware is the indispensable first step in the fight against sandbox evasion, since the insights gathered can in turn be used to harden sandboxing solutions against anti-analysis behavior.
Moreover, identifying and understanding evasive checks is currently a manual process performed by analysts; such a process does not scale, as it is extremely time-consuming and resource-intensive. This clashes with the need for large-scale analysis, which is required to keep up with the staggering number of new malware samples surfacing every day. In light of these considerations, automatic ways of identifying and disarming novel evasive strategies are clearly needed.
A first step towards the automatic identification of malware evasion patterns has been carried out by [KV15], which proposes a system for extracting evasion signatures from evasive malware. The system uses data mining and data-flow analysis techniques to automate signature extraction, and leverages algorithms borrowed from bioinformatics to detect evasive behavior in system call sequences. Although [KV15] has the merit of being, to the best of our knowledge, the first and only work tackling the problem of automatically identifying evasion schemes, it suffers from a number of severe limitations that hinder its effectiveness and its applicability in practice.
Our ideas go one step further and address these problems in a practical manner, which is how the project could advance the state of the art in malware analysis research and practice. In [KV15], evasion patterns are extracted by looking at the difference in system call sequences when executing each sample in two different environments: one in which the malware instance evades analysis and one in which the sample reveals its malicious behavior. This approach is unlikely to be effective if the sample under analysis is equipped with anti-analysis techniques targeting each of the employed execution environments, especially if the checks are combined, since in this scenario we would observe no difference in the invoked system calls. Our approach does not suffer from this issue, as we use BluePill to force a malware sample to exhibit, on demand, anti-analysis behavior against environments of our choice.
On top of that, the strategy of [KV15] involves executing a malware instance in multiple environments (e.g., QEMU, VirtualBox), making the whole analysis process cumbersome and heavyweight. Instead, we use lighter program analyses and setups: we require only a virtual machine equipped with BluePill, which makes our strategy easier and less costly for security firms to adopt. Moreover, the evasion signatures of [KV15] are based solely on system calls, whereas we go deeper: we characterize each evasion pattern using the involved instruction sequences and build profiles from tainted memory locations and invoked functions. Lastly, [KV15] is vulnerable to time-stalling strategies, while we are immune to them thanks to BluePill's ability to selectively massage the outcome of these types of checks.
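As a rough illustration of the differential comparison of system call sequences attributed to [KV15] above, the following Python sketch aligns two traces with the standard library's `difflib.SequenceMatcher`. This is a generic sequence diff, not the bioinformatics-inspired alignment used in [KV15]; the helper name `divergence_points` and the trace contents are hypothetical and invented for the example.

```python
from difflib import SequenceMatcher


def divergence_points(trace_a, trace_b):
    """Return the non-matching regions between two system call traces.

    Each entry is (tag, calls_only_in_a, calls_only_in_b), where tag is one
    of "replace", "delete", or "insert" as reported by difflib.
    """
    matcher = SequenceMatcher(a=trace_a, b=trace_b, autojunk=False)
    return [
        (tag, trace_a[i1:i2], trace_b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical traces: the sample evades in one environment and reveals
# its behavior in the other.
evading = ["NtOpenKey", "NtQueryValueKey", "NtClose", "NtTerminateProcess"]
revealing = ["NtOpenKey", "NtQueryValueKey", "NtClose",
             "NtCreateFile", "NtWriteFile", "NtTerminateProcess"]

# The diff isolates the calls that appear only when the sample does not evade.
diff = divergence_points(evading, revealing)

# If the sample evades in both environments, the traces coincide and the
# comparison carries no signal.
no_signal = divergence_points(evading, evading)
```

The second call shows the limitation discussed above: when a sample carries checks against every environment in use, the two traces match and the diff is empty.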
[Scientific Impact]
We hope to disseminate the results achieved by the present research project via publications in top-ranked security conferences (such as IEEE S&P, ACM CCS, NDSS, and USENIX Security) and journals (such as IEEE TIFS, IEEE TDSC, and ACM TOPS).
In fact, in recent years, the problem of evasive malware has been a very topical one in academia, with flagship venues showing clear interest in works addressing this issue.
Moreover, given the rising popularity of the topic we investigate, we are ready to share our ideas with other research groups from all over the world, hoping to establish new successful collaborations like the one currently in place with King's College London for this topic.
Additional References
[KV15] Kirat et al. MalGene: Automatic Extraction of Malware Analysis Evasion Signature. CCS, 2015.
[KVK14] Kirat et al. BareCloud: Bare-metal Analysis-based Evasive Malware Detection. USENIX Security, 2014.