EFFICIENT FREQUENT ITEMSET DISCOVERY THROUGH HIERARCHICAL HUFFMAN ENCODING

ICTACT Journal on Soft Computing (Volume: 15, Issue: 4)

Abstract

Frequent itemset mining holds a crucial position in data mining. However, traditional algorithms such as Apriori and FP-Growth often run into efficiency and memory problems on large-scale datasets, which makes it hard for them to cope with dynamically changing data and limits their use in practical applications. This paper therefore proposes DTFIMA (Dynamic Tiered Frequent Itemset Mining Algorithm, also written ADFIM), which optimizes the storage and search of frequent itemsets by combining dynamic-weight Huffman coding with logarithmic frequency stratification. In ADFIM, all itemsets are divided into three frequency levels (high, medium, and low), and an independent Huffman tree is built for each level, making the encoding and search of frequent itemsets more efficient. By dynamically updating weights and promptly pruning low-frequency items, ADFIM also improves the accuracy of frequent itemset mining and the algorithm's stability and reliability on large-scale data. Experimental results show that, compared with the traditional FP-Growth algorithm, ADFIM handles large-scale transaction databases more efficiently, especially dynamic data streams, significantly reducing computation time while preserving the accuracy and consistency of frequent itemset discovery.
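The tiering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the stratification rule, so the equal-width binning of log2(support), the per-item (rather than per-itemset) granularity, and the function names `item_counts`, `log_tiers`, and `huffman_codes` are all assumptions made for illustration. Dynamic weight updates and the pruning of low-frequency items are left out.

```python
import heapq
import math
from collections import Counter
from itertools import count


def item_counts(transactions):
    """Count the number of transactions in which each item occurs."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))
    return counts


def log_tiers(counts):
    """Split items into low/medium/high tiers.

    Assumption: three equal-width bins over log2(count); the paper's
    actual stratification rule is not stated in the abstract.
    """
    logs = {item: math.log2(c) for item, c in counts.items()}
    lo, hi = min(logs.values()), max(logs.values())
    width = (hi - lo) / 3 or 1.0  # avoid division by zero if all counts match
    names = ("low", "medium", "high")
    tiers = {name: {} for name in names}
    for item, lg in logs.items():
        idx = min(int((lg - lo) / width), 2)
        tiers[names[idx]][item] = counts[item]
    return tiers


def huffman_codes(weights):
    """Build a standard Huffman code table {item: bitstring} from {item: weight}."""
    if not weights:
        return {}
    if len(weights) == 1:
        return {next(iter(weights)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(w, next(tie), {item: ""}) for item, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {item: "0" + code for item, code in left.items()}
        merged.update({item: "1" + code for item, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


# Toy data, invented for this example only.
transactions = [
    ["a", "b", "c"], ["a", "b"], ["a", "c"], ["a", "d"],
    ["a", "b", "e"], ["a", "b"], ["a"], ["a", "b"],
]
tiers = log_tiers(item_counts(transactions))
# One independent code table per frequency tier.
codes = {name: huffman_codes(weights) for name, weights in tiers.items()}
```

Because each tier has its own tree, a code is only unambiguous together with its tier label; a full implementation would have to store or infer that label alongside the bitstring.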

Authors

Dai Xin, Hao Xue
Universiti Teknologi Malaysia, Malaysia

Keywords

Frequent Itemsets Mining, Huffman Coding, Logarithmic Frequency Stratification, Dynamic Data Streams

Published By
ICTACT
Date of Publication
January 2025
Pages
3682 - 3687