EARLY ONSET DETECTION OF DIABETES USING FEATURE SELECTION AND BOOSTING TECHNIQUES

Abstract
Diabetes is one of the most common diseases in humans. It is a metabolic disease with no permanent cure, but early detection can increase longevity. This research work focuses on predicting the early onset of diabetes. The diabetes dataset from the UCI Machine Learning Repository is used. The data are preprocessed to make them more robust and suitable for further processing. This work combines two feature selection techniques with two ensemble boosting techniques, yielding four combinations (models) to predict the presence of diabetes in a person. As a novelty, the number of features chosen by the feature selection techniques is reduced further. Using fewer features reduces the memory and time complexity of the model. Among the proposed models, LightGBM (Light Gradient Boosting Machine) with Recursive Feature Elimination (RFE) as the feature selector performed best. Further, LightGBM with the fewest features still gave satisfactory results.

Authors
Shruti Srivatsan¹, T Santhanam²
¹Sri Venkateswara College of Engineering, India; ²DG Vaishnav College, India

Keywords
Data Mining, Boosting, Medical Mining, Diabetes, Feature Selection
Published By: ICTACT
Published In: ICTACT Journal on Soft Computing (Volume: 12, Issue: 1)
Date of Publication: October 2021
DOI:

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.