Investigating Graph Embedding Neural Networks with Unsupervised Features Extraction for Binary Analysis

2019

Proceedings BAR 2019 Workshop on Binary Analysis Research


04 Publication in conference proceedings

Massarelli Luca, Di Luna Giuseppe Antonio, Petroni Fabio, Querzoni Leonardo, Baldoni Roberto

In this paper we investigate the use of graph embedding networks, with unsupervised feature learning, as a neural architecture for learning over binary functions.

We propose several ways to automatically extract features from the control flow graph (CFG), and we use the structure2vec graph embedding technique to translate a CFG into a vector of real numbers. We train and test our proposed architectures on two different binary analysis tasks: binary similarity and compiler provenance. We show that the unsupervised extraction of features improves accuracy on both tasks when compared with embedding vectors obtained from a CFG annotated with manually engineered features (i.e., the ACFG proposed in [39]).
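
To make the CFG-to-vector step concrete, the following is a minimal sketch of a structure2vec-style embedding, not the architecture evaluated in the paper. It assumes each basic block already has a feature vector (however extracted), treats CFG edges as undirected when aggregating neighbours, and uses a single linear map per step. The names structure2vec_embed, W1, W2 and rounds are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def structure2vec_embed(features, adjacency, W1, W2, rounds=2):
    """Simplified structure2vec-style graph embedding (illustrative only).

    features:  (n_nodes, d) array, one feature vector per basic block
    adjacency: (n_nodes, n_nodes) 0/1 array, adjacency[u, v] = 1 for edge u -> v
    W1:        (d, p) array projecting block features into embedding space
    W2:        (p, p) array applied to the summed neighbour embeddings
    rounds:    number of message-passing iterations
    """
    # Assumption: neighbours are taken in both directions of each CFG edge.
    neighbours = np.maximum(adjacency, adjacency.T)
    n_nodes, p = features.shape[0], W1.shape[1]
    mu = np.zeros((n_nodes, p))  # node embeddings start at zero
    for _ in range(rounds):
        # Each node combines its own features with its neighbours' embeddings.
        mu = np.tanh(features @ W1 + (neighbours @ mu) @ W2)
    # The graph embedding is the sum of the node embeddings.
    return mu.sum(axis=0)

# Usage: a toy 4-block CFG (diamond shape) with 3 features per block.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
A = np.array([[0, 1, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 5))
g = structure2vec_embed(X, A, W1, W2)  # vector of 5 real numbers
```

Because the node embeddings are summed at the end, the result does not depend on the order in which basic blocks are numbered, which is the property that makes such a vector usable for comparing functions.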

We additionally compare the results of techniques based on graph embedding networks with a recent architecture that does not make use of the structural information given by the CFG, and we observe similar performance. We formulate a possible explanation of this phenomenon and conclude by identifying important open challenges.

Binary Analysis; Binary Similarity