Abstract:
Security has become a critical issue for Industry 4.0 due to a variety of emerging cyber-security threats. Recently, many deep learning (DL) approaches have focused on intrusion detection. However, such approaches often require sending data to a central entity, which raises concerns related to privacy, efficiency, and latency. Moreover, despite the huge amount of data generated by Internet of Things (IoT) devices in Industry 4.0, labeled data is difficult to obtain because data labeling is costly and time-consuming. This poses a challenge for DL approaches that require labeled data. To address these issues, this article proposes a novel federated semisupervised learning scheme that takes advantage of both unlabeled and labeled data in a federated way. First, an autoencoder (AE) is trained on each device (using unlabeled local/private data) to learn representative, low-dimensional features. Then, a cloud server aggregates these models into a global AE using federated learning (FL). Finally, the cloud server composes a supervised neural network by adding fully connected layers (FCN) to the global encoder (the first part of the global AE) and trains the resulting model using publicly available labeled data. Extensive case studies on two real-world industrial datasets demonstrate that our model: (a) ensures that no local private data is exchanged; (b) detects attacks with high classification performance; (c) works even when only a small amount of labeled data is available; and (d) has low communication overhead.
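The three steps above can be illustrated with a minimal NumPy sketch. It is not the article's implementation: the abstract does not name the aggregation rule, so a sample-size-weighted average in the style of FedAvg is assumed here. The single-layer AE, the layer sizes, the client sample counts, and the helper names `fed_avg`, `encode`, and `classify` are all hypothetical. Local AE training and supervised training of the head are omitted, and random weights stand in for trained ones.

```python
import numpy as np


def fed_avg(client_params, client_sizes):
    """Average per-client parameter dicts, weighted by local sample count.

    Assumption: FedAvg-style weighting; the abstract only says "federated learning".
    """
    total = float(sum(client_sizes))
    return {
        name: sum(p[name] * (n / total) for p, n in zip(client_params, client_sizes))
        for name in client_params[0]
    }


def encode(x, ae):
    """Encoder half of the AE: one dense layer with tanh (hypothetical architecture)."""
    return np.tanh(x @ ae["W_enc"] + ae["b_enc"])


def classify(x, ae, head):
    """Global encoder followed by a fully connected softmax layer."""
    logits = encode(x, ae) @ head["W"] + head["b"]
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
n_features, n_latent, n_classes = 8, 3, 2  # illustrative sizes only

# Step 1 (stand-in): each device holds an AE trained on its own unlabeled data.
clients = [
    {
        "W_enc": rng.normal(size=(n_features, n_latent)),
        "b_enc": rng.normal(size=n_latent),
        "W_dec": rng.normal(size=(n_latent, n_features)),
        "b_dec": rng.normal(size=n_features),
    }
    for _ in range(3)
]
sizes = [100, 300, 600]  # hypothetical local sample counts

# Step 2: the server aggregates model parameters only; no raw data is sent.
global_ae = fed_avg(clients, sizes)

# Step 3: the server adds a fully connected head on top of the global encoder.
# The head is untrained here; in the scheme it is fit on public labeled data.
head = {"W": rng.normal(size=(n_latent, n_classes)), "b": np.zeros(n_classes)}
probs = classify(rng.normal(size=(5, n_features)), global_ae, head)
```

Only the parameter dictionaries travel from the devices to the server in this sketch, which reflects claims (a) and (d): private samples stay local, and the communication cost scales with the model size rather than with the dataset size.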