MalFamAware: Automatic Family Identification and Malware Classification Through Online Clustering
The skyrocketing growth rate of new malware brings novel challenges to protecting computers and networks. Discerning truly novel malware from variants of known samples is one way to keep pace with this trend. This can be done by grouping known malware into families by similarity and classifying new samples into those families. Because malware and its families evolve over time, approaches based on classifiers trained on a fixed ground truth are not suitable. Other techniques use clustering to identify families, but they need to periodically re-cluster the whole set of samples, which does not scale well.
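To make the contrast with periodic re-clustering concrete, the following is a minimal sketch of incremental (online) clustering, in which each new sample is either assigned to the nearest existing family or starts a new one, without revisiting earlier samples. It is an illustration only and not the algorithm used in this work: the class name `OnlineFamilyClusterer`, the Euclidean distance, the numeric feature vectors, and the `threshold` parameter are all assumptions introduced for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OnlineFamilyClusterer:
    """Hypothetical threshold-based incremental clusterer (illustrative only)."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed maximum distance to join a family
        self.centroids = []         # one running centroid per family
        self.counts = []            # number of samples seen per family

    def add(self, features):
        """Process one sample; return (family_index, is_new_family)."""
        # Find the nearest existing family centroid, if any.
        best, best_dist = None, float("inf")
        for i, centroid in enumerate(self.centroids):
            d = distance(features, centroid)
            if d < best_dist:
                best, best_dist = i, d

        if best is not None and best_dist <= self.threshold:
            # Known family: update its centroid as a running mean.
            n = self.counts[best] + 1
            self.centroids[best] = [
                c + (x - c) / n for c, x in zip(self.centroids[best], features)
            ]
            self.counts[best] = n
            return best, False

        # No family is close enough: treat the sample as a new family.
        self.centroids.append(list(features))
        self.counts.append(1)
        return len(self.centroids) - 1, True


clusterer = OnlineFamilyClusterer(threshold=1.0)
results = [
    clusterer.add([0.0, 0.0]),  # first sample starts family 0
    clusterer.add([0.2, 0.1]),  # close to family 0, so it joins it
    clusterer.add([5.0, 5.0]),  # far from family 0, so it starts family 1
    clusterer.add([5.3, 4.9]),  # close to family 1, so it joins it
]
```

Each call to `add` costs time proportional to the number of families rather than the number of samples seen so far, which is the property that avoids re-clustering the whole set. A production system would need a similarity measure and features suited to malware behaviour, which this sketch does not attempt to model.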