Abstract:
Recent years have seen a rise in state-sponsored malware. Advanced Persistent Threat groups (APTs) have been waging a covert war with few repercussions due to the clandestine nature of cyberconflict. Before sanctions can be imposed, an attack must be attributed, and malware attribution is an important stage of attack analysis because executing malware that exploits known vulnerabilities is one of the methods APT attackers use to establish a foothold in the target’s network. Prior attempts at automated attack attribution use behaviour reports from sandboxes as input to machine learning algorithms. Whilst this is a good and reliable approach, it has some limitations. For example, some APT files may detect that they are running in a sandboxed environment and stop executing or behave differently, leading to false or missing attributions. Hence, there is a need for an alternative feature extraction technique for attack attribution. This research proposes a novel framework for lightweight, automated attack attribution that uses fuzzy hashes as natural language input for machine learning classifiers. Experimental results show that the proposed framework attributes attacks with an average accuracy and F1-score of 89% and 87.5% for country and APT group classification. In addition, we demonstrate how the proposed approach provides a faster, more lightweight method of attack attribution, enhances advanced analysis of samples, and performs competitively with state-of-the-art dynamic analysis attribution engines.
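To make the idea of treating fuzzy hashes as natural language concrete, the following is a minimal sketch, not the framework evaluated in this work. It assumes ssdeep-style hash strings of the form `blocksize:hash1:hash2`, splits them into character trigrams that act as "words", and uses a toy nearest-neighbour rule with cosine similarity as a stand-in for a trained classifier. The function names (`char_ngrams`, `cosine`, `attribute`), the hash strings, and the labels `GroupA` and `GroupB` are all hypothetical.

```python
import math
from collections import Counter


def char_ngrams(fuzzy_hash, n=3):
    """Split a fuzzy-hash string into overlapping character n-grams ("words").

    Assumes an ssdeep-style layout "blocksize:hash1:hash2"; the block size
    is dropped and each hash part is tokenised separately.
    """
    parts = fuzzy_hash.split(":")
    chunks = parts[1:] if len(parts) > 1 else parts
    grams = []
    for chunk in chunks:
        grams.extend(chunk[i:i + n] for i in range(len(chunk) - n + 1))
    return grams


def cosine(a, b):
    """Cosine similarity between two bag-of-n-grams Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def attribute(sample_hash, labelled):
    """Return the label of the most similar labelled hash (toy 1-nearest-neighbour)."""
    query = Counter(char_ngrams(sample_hash))
    best = max(labelled, key=lambda item: cosine(query, Counter(char_ngrams(item[0]))))
    return best[1]


# Made-up hashes and labels, for illustration only.
TRAINING = [
    ("96:abcdEFGHijklMNOPqrst:abcdEFGHij", "GroupA"),
    ("96:zyxwVUTSrqpoNMLKjihg:zyxwVUTSrq", "GroupB"),
]

predicted = attribute("96:abcdEFGHijklMNOPqrsX:abcdEFGHiX", TRAINING)
print(predicted)
```

Because the hash is computed statically from the file, a pipeline of this shape never executes the sample, which is why sandbox evasion does not affect feature extraction. In practice, the n-gram counts would feed a trained classifier, not a nearest-neighbour lookup.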