VIRTUAL DISK BASED DATA CLUSTERING MAPREDUCE FRAMEWORK

Abstract
Segmentation is one of the important methods in data mining. As every field expands and becomes digitized, large data sets accumulate quickly. Clustering such large data sets poses a problem for conventional sequential segmentation algorithms because of the enormous processing time they require. Hence, distributed parallel architectures and algorithms are useful for meeting the efficiency and scalability requirements of clustering large data sets. In this study, we use the MapReduce programming model to develop and evaluate a parallel SVM algorithm, and we compare its results with those of a concurrent SVM when clustering document databases of varying size. The results show that the proposed SVM performs better than existing methods.
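The abstract does not describe how the parallel SVM is decomposed into map and reduce steps. As a rough illustration only, and not the authors' method, the sketch below shows one common way a linear SVM can be trained in a MapReduce style: each mapper computes the hinge-loss subgradient over its own data partition, and the reducer sums the partial results before a single weight update. All names (`map_partition`, `reduce_gradients`, `train_svm`, `predict`), the hyperparameters, and the toy data are assumptions made for this example; the mappers run sequentially here, whereas a real framework would distribute them across nodes.

```python
from functools import reduce


def map_partition(partition, w, b):
    """Map step: hinge-loss subgradient contribution of one data partition."""
    gw = [0.0] * len(w)
    gb = 0.0
    for x, y in partition:
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        if margin < 1.0:  # only margin violators contribute
            for i, xi in enumerate(x):
                gw[i] -= y * xi
            gb -= y
    return gw, gb, len(partition)


def reduce_gradients(a, b):
    """Reduce step: sum the partial subgradients and sample counts."""
    return [u + v for u, v in zip(a[0], b[0])], a[1] + b[1], a[2] + b[2]


def train_svm(partitions, dim, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent; lam, lr, epochs are assumed values."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        # In a real cluster each call below would run on a separate node.
        mapped = [map_partition(p, w, b) for p in partitions]
        gw, gb, n = reduce(reduce_gradients, mapped)
        w = [wi - lr * (lam * wi + gi / n) for wi, gi in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy data: two partitions of (feature vector, label) pairs, labels in {+1, -1}.
partitions = [
    [([2.0, 2.0], 1), ([3.0, 3.0], 1)],
    [([-2.0, -2.0], -1), ([-3.0, -3.0], -1)],
]
w, b = train_svm(partitions, dim=2)
```

Because the subgradient of the hinge loss is a sum over samples, summing per-partition contributions in the reducer yields the same update as a sequential pass over the whole data set; only the map step is spread across workers.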

Authors
M Vijayalakshmi, T Vinodh Kannan
Mookambigai College of Engineering, India

Keywords
Cloud Computing, Map Reduce, Clustering, Domain Clustering

Published By
ICTACT

Published In
ICTACT Journal on Data Science and Machine Learning (Volume: 1, Issue: 2)

Date of Publication
March 2020

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.