vioft2nntf2t|tblJournal|Abstract_paper|0xf4ffb20b2d0000002fb60c0001000400
Cluster analysis of data is a crucial tool for discovering and making sense of a dataset underlying structure. It has been put to use in many contexts and many different fields with great success. In addition, new innovations in the last decade have piqued the interest of clinical researchers, scientists, and biologists. As the number of dimensions in a data set grows, the consensus function of traditional ensemble clustering often fails to generate final clusters. The main problem with conventional ensemble clustering is exactly this. The proposed work employs a similarity measure between links to identify which clusters contain the unknown datasets. To this end, this study proposes employing an improved ensemble framework for clustering categorical datasets. More specifically, it employs ensemble machine learning methods to categorize data. Multiple machine learning algorithms are incorporated into this model. Objective performance indicators are used to compare a model to more traditional approaches to determine how effective each the proposed method is.