Segmentation is one of the important methods in data mining. As every field expands and becomes digitized, large data sets accumulate quickly. Clustering such large data sets poses a problem for conventional sequential segmentation algorithms because of the enormous time they require. Hence, distributed parallel architectures and algorithms are useful in meeting the efficiency and scalability requirements of clustering large data sets. In this study, we use the MapReduce programming model to develop and evaluate a parallel SVM algorithm, and compare its results with those of a concurrent SVM when clustering document databases of varying size. The results show that the proposed SVM achieves better performance than existing methods.