Multilevel Partitioning of Neural Networks for Constrained Internet of Things Devices

Abstract:

The growing number of Internet-of-Things (IoT) devices will generate an unprecedented amount of data in the coming years. Fog computing may prevent saturation of the network infrastructure by processing data at the edge or within the devices themselves. Consequently, machine intelligence built almost exclusively in the cloud can be distributed to edge devices. While deep learning techniques can adequately process massive IoT data volumes, their high resource demands pose a trade-off for execution on resource-constrained devices. This paper proposes PArtitioning Networks for COnstrained DEvices (PANCODE), a novel algorithm that employs a multilevel approach to partition large convolutional neural networks for distributed execution on constrained IoT devices, and evaluates its performance. Experimental results with the LeNet and AlexNet models show that our algorithm can produce partitionings that achieve up to 2173.53 times more inferences per second than the Best Fit algorithm and up to 1.37 times less communication than the second-best approach. We also show that the state-of-the-art METIS framework produces only invalid partitionings in the more constrained setups. The results indicate that our algorithm achieves higher inference rates and low communication costs for convolutional neural networks distributed among constrained and very constrained devices.