Xi'an Technological University
Subject: Computer Science, Software Engineering
eISSN: 2470-8038
E. Laxmi Lydia * / M. Ben Swarup
Keywords : K-Centroids Clustering, Big data, Hadoop Cluster, data access locality, data replication, system reliability, particle swarm optimization
Citation Information : International Journal of Advanced Network, Monitoring and Controls. Volume 1, Issue 2, Pages 34-46, DOI: https://doi.org/10.21307/ijanmc-2016-014
License : (CC BY 4.0)
Published Online: 02-April-2018
Big data storage management is one of the most challenging issues for Hadoop cluster environments, since a large number of data-intensive applications involve a high degree of data access locality. In traditional approaches, high-performance computing relies on dedicated servers for data storage and data replication. To address the disparateness among jobs and resources, a "Disparateness-Aware Scheduling algorithm" is proposed for the cluster environment. In this research work we present a K-centroids clustering mechanism for big data in a Hadoop cluster. The approach focuses mainly on energy consumption in the Hadoop cluster, which helps to increase system reliability. The resources of the Hadoop cluster are categorized using the K-Centroids clustering algorithm in order to minimize the scheduling delay. A novel provisioning mechanism is introduced that takes load, energy, and network time into consideration. These three parameters are integrated into an optimized fitness function, which Particle Swarm Optimization (PSO) uses to select the computing node. A failure may still occur in the network after a successful execution has completed. To improve the fault tolerance service, cluster migration is focused on the particular failed node: the node is recomputed by PSO and the corresponding optimal node is predicted. The experimental results show better scheduling length, scheduling delay, speedup, failure ratio, and energy consumption than existing systems.
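The node-selection step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not the authors' formulation: the abstract does not give the fitness function, so an equally weighted sum of load, energy, and network time is used, the node metrics are invented sample values normalised to [0, 1], and the names `fitness`, `to_index`, and `pso_select` are hypothetical. The PSO is a basic one-dimensional variant whose particle positions are rounded to node indices.

```python
import random

# Hypothetical node metrics, normalised to [0, 1]; lower is better.
NODES = [
    {"load": 0.80, "energy": 0.60, "network_time": 0.30},
    {"load": 0.20, "energy": 0.30, "network_time": 0.25},
    {"load": 0.50, "energy": 0.90, "network_time": 0.10},
    {"load": 0.35, "energy": 0.40, "network_time": 0.70},
]

# Assumed equal weights; the abstract does not state the actual weighting.
WEIGHTS = {"load": 1 / 3, "energy": 1 / 3, "network_time": 1 / 3}


def fitness(node):
    """Weighted sum of load, energy and network time (lower is better)."""
    return sum(WEIGHTS[k] * node[k] for k in WEIGHTS)


def to_index(position, n):
    """Map a continuous particle position to a valid node index."""
    return min(n - 1, max(0, int(round(position))))


def pso_select(nodes, particles=10, iterations=30, seed=0):
    """Return the index of the lowest-fitness node found by a basic PSO."""
    rng = random.Random(seed)
    n = len(nodes)

    def cost(x):
        return fitness(nodes[to_index(x, n)])

    pos = [rng.uniform(0, n - 1) for _ in range(particles)]
    vel = [0.0] * particles
    best_pos = pos[:]              # personal best position of each particle
    g_best = min(pos, key=cost)    # global best position seen so far
    for _ in range(iterations):
        for i in range(particles):
            r1, r2 = rng.random(), rng.random()
            # Standard velocity update: inertia + cognitive + social terms.
            vel[i] = (0.7 * vel[i]
                      + 1.5 * r1 * (best_pos[i] - pos[i])
                      + 1.5 * r2 * (g_best - pos[i]))
            # Clamp the new position to the valid index range.
            pos[i] = min(n - 1.0, max(0.0, pos[i] + vel[i]))
            if cost(pos[i]) < cost(best_pos[i]):
                best_pos[i] = pos[i]
            if cost(pos[i]) < cost(g_best):
                g_best = pos[i]
    return to_index(g_best, n)


if __name__ == "__main__":
    chosen = pso_select(NODES)
    print("selected node:", chosen, "fitness:", round(fitness(NODES[chosen]), 3))
```

Under the same assumptions, re-selection after a failure could be sketched by removing the failed node from the candidate list and calling `pso_select` again on the remaining nodes.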