A Novel Wrapper–Based Feature Selection for Early Diabetes Prediction Enhanced With a Metaheuristic

Abstract:

Diabetes causes health problems for hundreds of millions of people worldwide every year. Patient medical records quantify symptoms, body features, and clinical laboratory test values, which can be used in biostatistical analysis to find patterns or features that current practice does not detect. In this work, we propose a machine learning model to predict the early onset of diabetes. It is a novel wrapper-based feature selection method that uses Grey Wolf Optimization (GWO) and Adaptive Particle Swarm Optimization (APSO) to optimize a Multilayer Perceptron (MLP) and reduce the number of required input attributes. We also compare the results of this method with those of several conventional machine learning algorithms, such as Support Vector Machine (SVM), Decision Tree (DT), K-Nearest Neighbor (KNN), Naïve Bayesian Classifier (NBC), Random Forest Classifier (RFC), and Logistic Regression (LR). Computational results show that our proposed method not only needs far fewer features but also achieves higher prediction accuracy (96% for GWO-MLP and 97% for APGWO-MLP). This work has the potential to be applied in clinical practice as a supporting tool for physicians.
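To make the idea of wrapper-based feature selection with GWO concrete, the following is a minimal sketch, not the implementation used in this work. It assumes a thresholded binary encoding (a feature is selected when its position exceeds 0.5), uses the standard GWO position update, and omits the APSO component entirely. The names `binary_gwo_select` and `toy_fitness` are hypothetical; in a real wrapper, the fitness function would train a classifier such as an MLP on the selected columns and return its validation accuracy.

```python
import numpy as np


def binary_gwo_select(fitness, n_features, n_wolves=8, n_iters=30, seed=0):
    """Illustrative binary Grey Wolf Optimization over feature masks.

    fitness: callable taking a boolean mask of length n_features and
             returning a score to maximize. n_wolves must be at least 3.
    Returns the best evaluated mask and its score.
    """
    rng = np.random.default_rng(seed)
    # Continuous positions in [0, 1]; assumption: select a feature if > 0.5.
    pos = rng.random((n_wolves, n_features))
    best_mask, best_score = None, -np.inf

    for t in range(n_iters):
        masks = pos > 0.5
        scores = np.array([fitness(m) for m in masks])
        order = np.argsort(scores)[::-1]  # best wolf first
        if scores[order[0]] > best_score:
            best_score = scores[order[0]]
            best_mask = masks[order[0]].copy()

        leaders = pos[order[:3]]       # alpha, beta, delta wolves
        a = 2.0 * (1.0 - t / n_iters)  # decreases linearly from 2 toward 0
        new_pos = np.zeros_like(pos)
        for leader in leaders:
            r1 = rng.random(pos.shape)
            r2 = rng.random(pos.shape)
            A = 2.0 * a * r1 - a
            C = 2.0 * r2
            D = np.abs(C * leader - pos)
            new_pos += leader - A * D
        # Average the three leader-guided moves and keep positions in [0, 1].
        pos = np.clip(new_pos / 3.0, 0.0, 1.0)

    return best_mask, best_score


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy.

    Rewards selecting features 0..2 and penalizes the subset size.
    """
    return float(mask[:3].sum() - 0.1 * mask.sum())


mask, score = binary_gwo_select(toy_fitness, n_features=8)
print("selected features:", np.flatnonzero(mask), "score:", score)
```

The size penalty inside the fitness function is what pushes the search toward smaller feature subsets; without it, a wrapper has no reason to prefer fewer input attributes when accuracy is equal.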