EFFICIENT FREQUENT ITEMSET DISCOVERY THROUGH HIERARCHICAL HUFFMAN ENCODING
Abstract
Frequent itemset mining holds a crucial position in data mining. However, traditional algorithms such as Apriori and FP-Growth often run into efficiency and memory-consumption problems on large-scale datasets, which in some situations makes it difficult for them to cope with dynamically changing datasets and limits their use in practical applications. This paper therefore proposes DTFIMA (Dynamic Tiered Frequent Itemset Mining Algorithm), referred to below as ADFIM, which optimizes the storage and search of frequent itemsets by introducing dynamic-weight Huffman coding and combining it with logarithmic frequency stratification. In ADFIM, all itemsets are divided into three frequency levels (high, medium, and low), and an independent Huffman tree is built for each level, making the encoding and search of frequent itemsets more efficient. ADFIM also dynamically updates weights and promptly removes low-frequency items, which improves the accuracy of frequent itemset mining and the algorithm's stability and reliability on large-scale data. Experimental results show that, compared with the traditional FP-Growth algorithm, ADFIM handles large-scale transaction databases more efficiently, especially dynamic data streams, significantly reducing computation time while preserving the accuracy and consistency of frequent itemset discovery.
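
As a rough illustration of the idea described in the abstract, the sketch below stratifies items into three tiers by log2 frequency and builds a separate Huffman code for each tier. It is not the authors' implementation. The abstract does not give the tier boundaries, so splitting the log-frequency range into three equal-width bands is an assumption. The function names (stratify, huffman_codes, build_tiered_codes) are hypothetical, and the sketch works on single items rather than full itemsets to stay short. Dynamic weight updates and low-frequency cleanup are not covered.

```python
import heapq
import math
from collections import Counter
from itertools import count


def stratify(freqs):
    """Split items into 'low', 'medium', 'high' tiers by log2 frequency.

    Assumption: the log-frequency range is cut into three equal-width
    bands; the abstract does not state the actual thresholds.
    """
    tiers = {"low": {}, "medium": {}, "high": {}}
    if not freqs:
        return tiers
    logs = {item: math.log2(f) for item, f in freqs.items()}
    lo, hi = min(logs.values()), max(logs.values())
    # Fall back to width 1.0 when every item has the same frequency.
    width = (hi - lo) / 3 or 1.0
    names = ["low", "medium", "high"]
    for item, lg in logs.items():
        band = min(int((lg - lo) / width), 2)
        tiers[names[band]][item] = freqs[item]
    return tiers


def huffman_codes(weights):
    """Return a prefix-free binary code {item: bitstring} for one tier."""
    if not weights:
        return {}
    if len(weights) == 1:
        # A single symbol still needs a one-bit code.
        return {next(iter(weights)): "0"}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), {item: ""}) for item, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 / 1.
        merged = {item: "0" + code for item, code in left.items()}
        merged.update({item: "1" + code for item, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def build_tiered_codes(transactions):
    """Count item support, stratify it, and build one Huffman code per tier."""
    freqs = Counter(item for t in transactions for item in set(t))
    tiers = stratify(freqs)
    return {name: huffman_codes(members) for name, members in tiers.items()}
```

Calling build_tiered_codes on a list of transactions (each a list of item labels) returns a dictionary with one code table per tier. Because each tier has its own tree, a code is only unique within its tier, so an encoded item has to be stored together with its tier label.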

Authors
Dai Xin, Hao Xue
Universiti Teknologi Malaysia, Malaysia

Keywords
Frequent Itemsets Mining, Huffman Coding, Logarithmic Frequency Stratification, Dynamic Data Streams
Published By
ICTACT
Published In
ICTACT Journal on Soft Computing (Volume: 15, Issue: 4, Pages: 3682-3687)
Date of Publication
January 2025

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.