Diabetes is one of the most common human diseases. It is a metabolic disease with no permanent cure, but early detection can increase longevity. This research work focuses on predicting the early onset of diabetes, using the diabetes dataset from the UCI Machine Learning Repository. The necessary preprocessing techniques are applied to make the data more robust and suitable for further processing. The work combines two feature selection techniques with two ensemble boosting techniques, giving four combinations (models) for predicting the presence of diabetes in a person. As a novelty, the number of features chosen by the feature selection techniques is reduced further, which lowers the memory and time complexity of the model. Among the proposed models, LightGBM (Light Gradient Boosting) with Recursive Feature Elimination (RFE) as the feature selector performed better than the other combinations. Further, LightGBM with the fewest features still gave satisfactory results.
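The idea behind Recursive Feature Elimination can be illustrated with a minimal sketch: fit a model, drop the feature the model weights least, and repeat until the desired number of features remains. The sketch below is not the authors' pipeline. To stay within the Python standard library it uses a small hand-written logistic regression as a stand-in for LightGBM, and it runs on synthetic data rather than the UCI diabetes dataset. The names `fit_logistic`, `rfe`, and `make_data` are hypothetical.

```python
import math
import random


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent; return feature weights.

    Stand-in estimator (assumption): the paper uses boosting models instead.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def rfe(X, y, n_keep):
    """Recursive feature elimination: refit, then drop the weakest feature."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = fit_logistic(X_sub, y)
        # Importance here is |weight|; a boosting model would supply its own
        # feature importances instead.
        weakest = min(range(len(remaining)), key=lambda k: abs(w[k]))
        del remaining[weakest]
    return remaining


def make_data(n=300, seed=0):
    """Synthetic data (assumption): only features 0 and 1 drive the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(5)]
        score = 2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.5)
        X.append(row)
        y.append(1 if score > 0 else 0)
    return X, y


X, y = make_data()
selected = rfe(X, y, n_keep=2)
print("kept feature indices:", selected)
```

With a library estimator, the same loop is typically delegated to an existing RFE implementation wrapped around the boosting model. Reducing `n_keep` further corresponds to the additional feature reduction described in the abstract.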
Shruti Srivatsan (Sri Venkateswara College of Engineering, India), T Santhanam (DG Vaishnav College, India)
Data Mining, Boosting, Medical Mining, Diabetes, Feature Selection
Published By: ICTACT
Published In: ICTACT Journal on Soft Computing (Volume: 12, Issue: 1, Pages: 2474-2484)
Date of Publication: October 2021