Frequent itemset mining holds a crucial position in the field of data
mining. However, traditional algorithms such as Apriori and FP-Growth
often run into efficiency and memory-consumption problems on
large-scale datasets, which in some situations makes it hard for them
to cope with dynamically changing data and limits their widespread use
in practical applications. This paper therefore proposes a novel
algorithm, DTFIMA (Dynamic Tiered Frequent Itemset Mining Algorithm),
which optimizes the storage and search of frequent itemsets by
introducing dynamic-weight Huffman coding and combining it with
logarithmic frequency stratification. In ADFIM, all itemsets are
divided into three frequency levels (high, medium, and low), and an
independent Huffman tree is built for each level, which makes the
encoding and search of frequent itemsets more efficient.
In addition, by dynamically updating weights and promptly pruning
low-frequency items, ADFIM improves the accuracy of frequent itemset
mining and the algorithm's stability and reliability on large-scale
data. Experimental results show that, compared with the traditional
FP-Growth algorithm, ADFIM handles large-scale transaction databases
more efficiently, especially dynamic data streams, and significantly
reduces computation time while preserving the accuracy and consistency
of frequent itemset discovery.
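
The tiering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the binning rule or the weight definition, so the equal-width bins on a log2 scale, the use of raw support counts as Huffman weights, and the names `stratify` and `huffman_codes` are all assumptions made for illustration, and the item counts are toy values.

```python
import heapq
import math
from itertools import count


def stratify(freqs, n_levels=3):
    """Split items into tiers by log2(frequency).

    Assumption: equal-width bins between the smallest and largest
    log-frequency. tiers[0] is the low tier, tiers[-1] the high tier.
    """
    logs = {item: math.log2(f) for item, f in freqs.items()}
    lo, hi = min(logs.values()), max(logs.values())
    width = (hi - lo) / n_levels or 1.0  # avoid zero width if all counts are equal
    tiers = [dict() for _ in range(n_levels)]
    for item, f in freqs.items():
        idx = min(int((logs[item] - lo) / width), n_levels - 1)
        tiers[idx][item] = f
    return tiers


def huffman_codes(weights):
    """Build a prefix-free Huffman code for one tier (item -> bit string)."""
    if not weights:
        return {}
    if len(weights) == 1:
        return {next(iter(weights)): "0"}
    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(w, next(tie), {item: ""}) for item, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {i: "0" + c for i, c in c1.items()}
        merged.update({i: "1" + c for i, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


# Toy support counts (hypothetical data, not from the paper).
freqs = {"a": 8, "b": 4, "c": 2, "d": 1, "e": 1}
tiers = stratify(freqs)                      # [low, medium, high]
codes = [huffman_codes(t) for t in tiers]    # one independent code table per tier
```

Keeping a separate code table per tier means that a change in the counts of one tier only requires rebuilding that tier's tree, which is one plausible reading of why the stratification helps with dynamic updates.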
Authors : Dai Xin, Hao Xue
Affiliation : Universiti Teknologi Malaysia, Malaysia
Keywords : Frequent Itemsets Mining, Huffman Coding, Logarithmic Frequency Stratification, Dynamic Data Streams
Published By : ICTACT
Published In : ICTACT Journal on Soft Computing (Volume: 15, Issue: 4, Pages: 3682-3687)
Date of Publication : January 2025