Abstract:
Deep learning has recently produced remarkable performance improvements in machine fault diagnosis using only raw vibration signals as input, without signal preprocessing. However, research on deep-learning-based machine fault diagnosis has focused primarily on model architectures, even though the optimizer and its hyperparameters can have a significant impact on model performance. This paper presents extensive benchmarking results on optimizer hyperparameter tuning across combinations of datasets, convolutional neural network (CNN) models, optimizers, and batch sizes. First, we defined the hyperparameter search space and trained the models with hyperparameters sampled from a quasi-random sequence. We then refined the search space based on the results of the first step and finally evaluated model performance on noise-free and noisy data. The results showed that the learning rate and the momentum factor, which determine training speed, substantially affected model accuracy. We also found that the effects of batch size and training speed on model performance were strongly coupled: large batch sizes yielded higher performance at high learning rates or momentum factors, whereas small batch sizes yielded higher performance at low learning rates or momentum factors. In addition, given the growing interest in on-device artificial intelligence (AI) solutions, we assessed the accuracy and computational efficiency of the candidate models. Among the benchmarked models, the CNN with training interference (TICNN) offered the best trade-off between computational efficiency and robustness to noise.
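To make the quasi-random sampling step concrete, the following is a minimal sketch, assuming SciPy's Sobol sequence generator (`scipy.stats.qmc.Sobol`); the search-space bounds, sample count, and log-scaling of the learning rate are illustrative assumptions, not the paper's actual setup or code.

```python
# Minimal sketch of quasi-random hyperparameter sampling (not the paper's code).
# Assumes SciPy >= 1.7 for scipy.stats.qmc; bounds below are hypothetical.
import numpy as np
from scipy.stats import qmc

# Search space: learning rate on a log scale, momentum on a linear scale.
log_lr_bounds = (np.log10(1e-4), np.log10(1e-1))
momentum_bounds = (0.0, 0.99)

# Draw 16 low-discrepancy points in the 2-D unit square.
sampler = qmc.Sobol(d=2, scramble=True, seed=0)
unit_samples = sampler.random(n=16)

# Map unit-square points onto the search space.
scaled = qmc.scale(
    unit_samples,
    l_bounds=[log_lr_bounds[0], momentum_bounds[0]],
    u_bounds=[log_lr_bounds[1], momentum_bounds[1]],
)
learning_rates = 10.0 ** scaled[:, 0]  # undo the log transform
momenta = scaled[:, 1]

# Each (learning rate, momentum) pair defines one training run.
for lr, m in zip(learning_rates, momenta):
    print(f"train model with lr={lr:.2e}, momentum={m:.3f}")
```

A scrambled Sobol sequence covers the search space more evenly than independent uniform draws, which is why quasi-random sampling is a common choice for a first-pass hyperparameter sweep before narrowing the search space.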