Most subspace clustering algorithms use the monotonicity property to generate higher-dimensional subspaces. This property does not hold here, because density varies with subspace cardinality: if a k-dimensional unit is dense, a (k-1)-dimensional projection of that unit need not be dense. DENCOS therefore devises a mechanism that computes upper bounds on region densities to constrain the search for dense regions; regions whose density upper bounds fall below the density thresholds are pruned. It computes these upper bounds using a data structure, the DFP-tree, which stores summarized information about the dense regions. The DFP-tree employs the FP-Growth algorithm, building an FP-tree based on the prefix-tree concept and using it throughout the subspace identification process. This method performs repeated horizontal traversals of the data to generate relevant subspaces, which is time consuming. To reduce the running time, we employ the ITL data structure to build a Density Conscious ITL (DITL) tree that is used throughout the subspace identification process. ITL reduces the cost by scanning the database only once, which significantly reduces horizontal traversals of the database. The algorithm is evaluated experimentally on a collection of benchmark datasets. The results show favorable performance compared with other popular clustering algorithms.
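
The two ideas above, cardinality-dependent density thresholds and pruning by a density upper bound, can be illustrated with a minimal Python sketch. It is not the DFP-tree or the DITL tree: a plain per-unit row-id index, built in one scan, stands in for them. The threshold formula `alpha * N / n_intervals**k`, the equal-width grid on [0, 1), and all function names are assumptions made for this illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_tid_lists(points, n_intervals):
    """Scan the data once and map each 1-d unit (dim, interval) to its row ids."""
    tid_lists = defaultdict(set)
    for tid, point in enumerate(points):
        for dim, value in enumerate(point):
            # Assumed grid: values lie in [0, 1) and are cut into equal-width intervals.
            interval = min(int(value * n_intervals), n_intervals - 1)
            tid_lists[(dim, interval)].add(tid)
    return dict(tid_lists)


def density_threshold(k, n_points, n_intervals, alpha):
    """Assumed threshold: alpha times the average count of a k-dimensional unit."""
    return alpha * n_points / (n_intervals ** k)


def count_upper_bound(region, tid_lists):
    """A region cannot hold more rows than its least populated 1-d unit."""
    return min(len(tid_lists.get(unit, ())) for unit in region)


def region_count(region, tid_lists):
    """Exact count: rows that fall in every 1-d unit of the region."""
    return len(set.intersection(*(tid_lists.get(unit, set()) for unit in region)))


def dense_regions(points, n_intervals, alpha, k):
    """Return {region: count} for the dense k-dimensional units."""
    tid_lists = build_tid_lists(points, n_intervals)
    threshold = density_threshold(k, len(points), n_intervals, alpha)
    dense = {}
    for region in combinations(sorted(tid_lists), k):
        if len({dim for dim, _ in region}) < k:
            continue  # a unit takes at most one interval per dimension
        if count_upper_bound(region, tid_lists) < threshold:
            continue  # pruned: the upper bound is already below the threshold
        count = region_count(region, tid_lists)
        if count >= threshold:
            dense[region] = count
    return dense


points = [(0.1, 0.1), (0.15, 0.2), (0.2, 0.05), (0.3, 0.6),
          (0.6, 0.3), (0.55, 0.8), (0.8, 0.55), (0.9, 0.9)]
dense_1d = dense_regions(points, n_intervals=4, alpha=3, k=1)
dense_2d = dense_regions(points, n_intervals=4, alpha=3, k=2)
```

On this toy data the 2-dimensional unit covering the first interval of both dimensions is dense, while neither of its 1-dimensional projections is, which is the failure of monotonicity described above. Under the assumed formula the threshold drops from 6 for k = 1 to 1.5 for k = 2, and every other 2-dimensional candidate is discarded by the upper-bound check or the exact count.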

C. Palanisamy1, S. Selvan2
Bannari Amman Institute of Technology, Tamil Nadu, India1, Alpha Engineering College, Chennai, Tamil Nadu, India2

Subspace Clustering, ITL Tree, Recall, Precision
Published In: ICTACT Journal on Soft Computing (Volume: 1, Issue: 3)
Date of Publication: January 2011

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.