Influence of Data Balancing on Transformer DGA Fault Classification With Machine Learning Algorithms

Abstract:

The application of artificial intelligence (AI) algorithms to transformer incipient fault classification using dissolved gas analysis (DGA) is an attractive engineering approach. However, various factors affect the performance of these algorithms. This article presents the influence of the data balancing approach on transformer DGA fault classification with machine learning (ML). In this work, a total of 4580 DGA samples from in-service transformers are considered for training various ML models. The main challenge for the DGA problem lies in the availability of data from transformers under normal degradation, and a uniform distribution of samples across the different fault types is almost impossible to obtain. This is because DGA is not an exact science but an empirical approach subject to variability. Thus, it is usual practice to apply data sampling techniques, which largely influence the efficiency of the algorithms. The present work reports the impact of data balancing schemes on fault classification performance and demonstrates that a careful choice of the data sampling method and ML algorithm is essential for DGA problems. To demonstrate the global scalability of the proposed model, the model is tested on the IEC TC 10 data (field inspection data), which are not exposed to the model during the learning stage. The adaptive synthetic sampling (ADASYN) method is reported to significantly enhance the global-scale capability of AI-based transformer DGA fault classification. This approach will help condition monitoring engineers working on transformer insulation diagnosis to implement monitoring modules for large transformer fleets and to understand insulation oil behavior over the years.