Improved Balanced Parallel FP-Growth with MapReduce
Abstract
The existing parallel FP-Growth algorithm (PFP) solves problems such as partitioning the transaction dataset, and it guarantees that each partition is independent after the split. However, problems remain: mining the FP-tree on a single node takes too many iterations and is inefficient, and PFP does not consider load balance when the master node distributes the dataset to the child nodes. We apply a cutting strategy to PFP, the original MapReduce-based algorithm, which merges the infrequent paths in the FP-tree. In addition, a load-balancing strategy is used when the master node divides the dataset among the child nodes. By combining these two strategies, this paper designs a new parallel FP-Growth algorithm.
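The balanced grouping idea can be illustrated with a minimal sketch. The abstract does not state how load is estimated or how groups are formed, so everything below is an assumption: `item_loads` is a hypothetical mapping from each frequent item to an estimated mining cost, and the greedy heaviest-first assignment is a generic balancing heuristic, not necessarily the strategy used in the paper.

```python
import heapq


def balanced_groups(item_loads, num_groups):
    """Assign items to groups so that estimated group loads stay close.

    item_loads: dict mapping item -> estimated load (hypothetical estimate;
                the abstract does not define how load is measured).
    num_groups: number of child nodes / groups to fill.
    """
    if num_groups <= 0:
        raise ValueError("num_groups must be positive")

    # Min-heap of (total_load, group_index): the lightest group is on top.
    heap = [(0, g) for g in range(num_groups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(num_groups)]

    # Place the heaviest items first; ties are broken by item name
    # so the result is deterministic.
    ordered = sorted(item_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for item, load in ordered:
        total, g = heapq.heappop(heap)
        groups[g].append(item)
        heapq.heappush(heap, (total + load, g))
    return groups


# Usage with made-up load estimates for five frequent items.
loads = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 4}
groups = balanced_groups(loads, 2)
group_totals = [sum(loads[i] for i in grp) for grp in groups]
```

With these made-up loads the two groups end up with totals of 17 and 13, whereas assigning items to groups in frequency order without regard to load could leave one child node with most of the work.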
Keywords
Balanced Grouping, Parallel Computing, FP-Growth, MapReduce
DOI
10.12783/dtcse/aice-ncs2016/5681