Machine Learning Intrusion Detection in Big Data Era: A Multi-Objective Approach for Longer Model Lifespans

Abstract:

Although the literature reports highly accurate intrusion detection schemes based on machine learning (ML), changes in network traffic behavior quickly degrade their accuracy. Updating an intrusion detection model is not easily feasible because of the enormous amount of network traffic that must be processed in near real time in high-speed networks, particularly under big data settings. In this paper, we propose a new scalable, long-lasting intrusion detection architecture for processing network content and building a reliable ML-based intrusion detection model. Experiments on five years of network traffic, about 20 TB of data, show that our approach extends the lifespan of our model by up to six weeks: the average accuracy rate of our proposal held for eight weeks after the training phase, whereas that of traditional approaches held for only two weeks after model building. Additionally, our proposal achieves up to 10 Gbps of detection throughput on a 20-core big data processing cluster.