The widespread adoption of Machine Learning (ML) has heavily altered the landscape of the computer science industry in recent years. Novel algorithms and cheap, powerful hardware make it easy even for smaller companies to adopt ML-based solutions and to deploy ML models in mass-market consumer goods such as smartphones. The success of machine learning is undeniable: ML models are already able to outperform humans in various tasks such as image recognition and voice recognition, and in complex games such as chess and Go. On the back of these successes, ML, and Neural Networks (NN) in particular, is currently being applied to critical applications such as automated driving, autonomous systems (e.g., drones, infrastructure monitoring) and cybersecurity.
Whilst the benefits of ML are clear, there is still uncertainty about the security and suitability of ML techniques in critical applications. Indeed, researchers have repeatedly shown that ML, and NNs in particular, are subject to a wide variety of attacks, with effects ranging from reduced effectiveness to complete destabilization of the system the NN is managing. These adversarial attacks can force models to misclassify inputs, resulting in wrong predictions or recognition (e.g., a car mistaking a pedestrian for a shadow); leak private information about users (e.g., in collaborative learning scenarios); poison the model and introduce triggers that allow for stealthy attacks (e.g., poisoning of an ML-based cybersecurity system); and much more.
Given the increasingly widespread adoption of ML in critical systems and our lack of understanding of its security in several key application scenarios, additional research in this direction is needed. This project aims to investigate the applicability of Neural Networks to critical systems such as cybersecurity, while taking into consideration their robustness to attacks.
The research proposed by this project has considerable potential for real-life impact, given the widespread adoption of ML-based techniques across different sectors of computer science. Poisoning attacks and evasion attacks can have profound consequences when used against critical applications such as ML-based malware detection or self-driving cars. Currently published research suffers from several drawbacks and limitations. For machine learning poisoning attacks and trojaning, most detection techniques currently proposed rely on the concept of locality: patches used to trojan samples are located in a constrained area of the input. Countermeasures such as those proposed by Chou et al. [33] rely on this locality constraint to detect trojaned samples based on saliency maps. However, such detection can be avoided by relying on dynamic, non-local perturbations rather than a static patch as the trojan trigger. In the context of evasion attacks, techniques such as [12,16], while extremely effective, alter input data to achieve evasion without considering potential behavioural constraints. For instance, if we consider evasion of an ML-based malware detector, there are constraints on which modifications of the input are allowed. In particular, since the goal is to modify a malware sample in such a way that it is classified as benign by the ML detector, the main constraint is that the modifications do not alter its intended behaviour (i.e., it still performs its malicious tasks). Current state-of-the-art evasion techniques and their countermeasures do not consider such constraints on the input, with the exception of [15]. However, even in [15] the behaviour is modeled in a simplistic way that is only applicable to the restricted context considered in the paper. The general problem of evading ML-based malware detection is still an open research question, as is the design of appropriate countermeasures for such attacks.
More complex evasion attacks such as those proposed by De Gaspari et al. in [27] show that it is possible to maintain malicious functionality while evading classification. However, it appears feasible to use adversarial training to strengthen the classifiers against at least some of the attacks proposed by the authors. Indeed, some of the evasion techniques proposed in [27] introduce some peculiar behaviours in the malware, which can be exploited to detect the modified evasive malware.
In this project, I aim to explore several promising areas related to adversarial attacks. I plan to build on our previous work [27] in order to propose a new model that is more resilient against the evasion attacks we proposed. In particular, I believe that it is possible to exploit some peculiar features that the evasion techniques in [27] introduce to make ML classifiers more robust against these types of attacks. Process split and functional split malware processes (defined in [27]) both introduce behaviour that is atypical and different from that of benign processes. I plan to exploit this asymmetry and show that adversarial training is sufficient to make an ML model robust against these types of attack. Moreover, I plan to investigate the applicability of Invertible Neural Networks (INN) to the context of cybersecurity. Invertible neural networks are a family of neural networks that provide a direct and unique mapping between an input sample and a latent point output by the INN. INNs are trained to learn the probability distribution associated with a given dataset, and provide an immediate and easy way to compute the probability density of an arbitrary point, including points that are not part of the dataset. These characteristics make INNs potential candidates for applications such as poisoning detection, as well as many other cybersecurity-related applications. Poisoned samples are generally a small minority of the dataset, since the general assumption in the literature is that the adversary can only influence a portion of the dataset used for training. Therefore, it might be possible to use outlier detection techniques coupled with INN probability estimation in order to detect poisoned samples, since they should reside in regions of low probability density.
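The density-based detection idea can be illustrated with a minimal sketch. It is not the model proposed in this project: a per-feature affine map stands in for a trained INN, since it is the simplest map that is invertible and has a tractable Jacobian. The log-density follows from the change-of-variables rule, log p(x) = log N(z; 0, I) + log |det dz/dx|. The names `AffineFlow` and `flag_low_density`, the synthetic data, and the 5% threshold are all assumptions made for illustration.

```python
import math
import random
import statistics


class AffineFlow:
    """Toy invertible map z = (x - mu) / sigma, applied per feature.

    Stand-in for a trained INN (assumption): invertible with a tractable
    Jacobian, but only able to model an axis-aligned Gaussian.
    Assumes no feature is constant (sigma must be non-zero).
    """

    def fit(self, data):
        columns = list(zip(*data))
        self.mu = [statistics.fmean(c) for c in columns]
        self.sigma = [statistics.pstdev(c) for c in columns]
        return self

    def forward(self, x):
        # Input space -> latent space.
        return [(xi - m) / s for xi, m, s in zip(x, self.mu, self.sigma)]

    def inverse(self, z):
        # Latent space -> input space (exact inverse of forward).
        return [zi * s + m for zi, m, s in zip(z, self.mu, self.sigma)]

    def log_density(self, x):
        # Change of variables: standard-normal log-density of z
        # plus log|det dz/dx| = -sum(log sigma_i).
        z = self.forward(x)
        log_base = sum(-0.5 * zi * zi - 0.5 * math.log(2 * math.pi) for zi in z)
        log_det = -sum(math.log(s) for s in self.sigma)
        return log_base + log_det


def flag_low_density(flow, data, fraction=0.05):
    """Return indices of the lowest-density `fraction` of the samples."""
    scores = [flow.log_density(x) for x in data]
    k = max(1, int(fraction * len(data)))
    threshold = sorted(scores)[k - 1]
    return [i for i, s in enumerate(scores) if s <= threshold]


# Synthetic data (assumption): 200 clean samples and 5 "poisoned"
# samples placed far from the clean distribution.
rng = random.Random(0)
clean = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
poisoned = [[rng.gauss(8.0, 0.2), rng.gauss(8.0, 0.2)] for _ in range(5)]
dataset = clean + poisoned

flow = AffineFlow().fit(dataset)
flagged = flag_low_density(flow, dataset, fraction=0.05)
print(flagged)
```

In this toy setting the poisoned points sit far from the bulk of the data, so they fall among the lowest-density samples. Realistic poisoned samples are crafted to be much less conspicuous, and whether an INN assigns them low density is precisely the open question the project would investigate.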