Improved Balanced Parallel FP-Growth with MapReduce

Qing YANG, Fei-Yang DU, Xi ZHU, Cheng-Gong JIANG

Abstract


Existing parallel FP-Growth algorithms have solved problems such as partitioning the transaction dataset so that each partition can be mined independently after the split. However, problems remain: mining the FP-tree on a single node requires too many iterations and is inefficient. Moreover, these algorithms do not consider load balance when the master node distributes the dataset to the child nodes. By applying a cutting strategy to PFP, the original MapReduce-based parallel FP-Growth algorithm, we merge the infrequent paths in the FP-tree. In addition, a load-balancing strategy is used when the master node distributes the dataset to the child nodes. Combining these two strategies, this paper designs a new parallel FP-Growth algorithm.
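The balanced-grouping idea can be illustrated with a minimal sketch. The abstract does not state how load is estimated or how groups are formed, so everything below is an assumption for illustration only: the support count of each frequent item stands in for its mining cost, items are assigned greedily to the currently least-loaded group, and the function name `balanced_groups` is hypothetical. This is not the authors' implementation.

```python
import heapq
from collections import Counter


def balanced_groups(transactions, num_groups, min_support):
    """Assign frequent items to groups so estimated loads stay close.

    Assumption (not from the paper): an item's support count is used
    as a proxy for the cost of mining its conditional FP-tree.
    """
    # Count each item once per transaction.
    counts = Counter(item for t in transactions for item in set(t))

    # Keep frequent items, heaviest first (ties broken by item name).
    frequent = sorted(
        ((item, c) for item, c in counts.items() if c >= min_support),
        key=lambda pair: (-pair[1], pair[0]),
    )

    # Min-heap of (accumulated load, group id).
    heap = [(0, g) for g in range(num_groups)]
    heapq.heapify(heap)
    groups = {g: [] for g in range(num_groups)}

    # Greedy step: give the next item to the least-loaded group.
    for item, load in frequent:
        total, g = heapq.heappop(heap)
        groups[g].append(item)
        heapq.heappush(heap, (total + load, g))

    return groups


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"], ["b", "d"]]
    print(balanced_groups(data, num_groups=2, min_support=2))
```

In a MapReduce setting, each group would then correspond to one reducer that mines the conditional transactions for its items. A real system would likely need a better cost estimate than raw support.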

Keywords


Balanced Grouping, Parallel Computing, FP-Growth, MapReduce


DOI
10.12783/dtcse/aice-ncs2016/5681
