Research on parallel particle swarm optimization clustering algorithm based on MapReduce model
ZHAO Yan; SUN Jun
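Abstract:
Clustering very large data sets is hampered by the difficulty of acquiring, storing, retrieving, analyzing, and visualizing big data. To cope with explosively growing data, a parallel particle swarm optimization clustering algorithm (PSOC-MR) is built on the MapReduce model using distributed and parallel computing principles, so that very large data sets can be clustered effectively. Experimental results show that PSOC-MR scales well when the number of cluster nodes and the data set size grow in proportion, and that it achieves linear speedup while preserving clustering quality. The algorithm can effectively solve the clustering problem for very large data sets and supports low-cost, high-performance commercial big data analysis.
Keywords: particle swarm optimization; clustering algorithm; parallel computing; MapReduce model; cluster processing; big data analysis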
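Foundation: National Natural Science Foundation of China (61672263); Natural Science Foundation of Jiangsu Province (BK20131097); Jiangsu Higher Vocational College Teachers' Professional Leader Advanced Training (Individual Visiting Scholar) Fund Project (2019GRGDYX015); 2017 Jiangsu Universities "Qinglan Project" Fund Project (2017JSJW007); Jiangsu Province Fifth-Phase "333 Project" Third-Level Trainee Fund Project (Su Rencai Ban [2018] No. 6); College Research Project (JSITKY201804)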
Author: ZHAO Yan; SUN Jun
DOI: 10.16652/j.issn.1004-373x.2021.07.027