An Analytical Approach for Optimizing the Performance of Hadoop Map Reduce Over RoCE
Data-intensive systems aim to process “big” data efficiently. Several data processing engines have evolved over the past decade, and they are modeled around the MapReduce paradigm. This article explores Hadoop's MapReduce engine and proposes techniques to obtain a higher level of optimization by borrowing concepts from the world of high-performance computing. Consequently, power consumption and heat generation are lowered. The article designs a system with three features: a pipelined dataflow, in contrast to the existing unregulated “bursty” flow of network traffic; the ability to carry out Map and Reduce tasks in parallel; and the use of Remote Direct Memory Access (RDMA), a modern high-performance computing concept. To support the claim of increased performance, the authors provide an algorithm for MapReduce over RoCE (RDMA over Converged Ethernet) and a mathematical derivation contrasting its runtime with that of vanilla Hadoop. According to this derivation, the proposed system runs 1.67 times faster than vanilla Hadoop.
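The benefit of overlapping Map and Reduce work, as described in the abstract, can be illustrated with a toy two-stage pipeline model. This is a minimal sketch and not the authors' derivation: the function names, the fixed per-chunk costs, and the assumption that Reduce can start on a chunk as soon as its Map finishes are all hypothetical simplifications, and the model makes no attempt to reproduce the 1.67 figure.

```python
def sequential_runtime(n_chunks: int, map_time: float, reduce_time: float) -> float:
    """Runtime when all Map work finishes before any Reduce work starts."""
    return n_chunks * map_time + n_chunks * reduce_time


def pipelined_runtime(n_chunks: int, map_time: float, reduce_time: float) -> float:
    """Runtime of a two-stage pipeline (assumed model).

    Reduce on chunk i overlaps with Map on chunk i+1, so after the first
    chunk is mapped, throughput is bounded by the slower stage.
    """
    if n_chunks == 0:
        return 0.0
    slower_stage = max(map_time, reduce_time)
    return map_time + (n_chunks - 1) * slower_stage + reduce_time


def speedup(n_chunks: int, map_time: float, reduce_time: float) -> float:
    """Ratio of sequential to pipelined runtime under the toy model."""
    return (sequential_runtime(n_chunks, map_time, reduce_time)
            / pipelined_runtime(n_chunks, map_time, reduce_time))


if __name__ == "__main__":
    # Hypothetical workload: 10 chunks, Map costs 2.0 units, Reduce costs 1.0.
    print(sequential_runtime(10, 2.0, 1.0))
    print(pipelined_runtime(10, 2.0, 1.0))
    print(round(speedup(10, 2.0, 1.0), 2))
```

For the hypothetical workload above, the sequential model takes 30.0 time units and the pipelined model 21.0. In this toy model the speedup approaches 2 only when both stages cost the same and the number of chunks is large, which is why real gains depend on how evenly the Map and Reduce stages are balanced.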
Related Results
Hadoop Tools
As the name indicates, this chapter explains the various additional tools provided by Hadoop. The additional tools provided by Hadoop distribution are Hadoop Streaming, Hadoop Arch...
Secure Cloud Data with Attribute-based Honey Encryption
Abstract
Encryption is a technique for converting plain text into cipher text, which is unreadable without an appropriate decryption key. Hadoop is a platform to process and st...
Hadoop Ecosystem and Cloud Integration
The integration of the Hadoop ecosystem with cloud computing marks a transformative evolution in the way organizations manage and analyze large-scale data. This study examines how ...
YouTube: big data analytics using Hadoop and map reduce
We live today in a digital world where a tremendous amount of data is generated by each digital service we use. This vast amount of generated data is called Big Data. According to Wikipe...
The Predictive Value of MAP and ETCO2 Changes After Emergency Endotracheal Intubation for Severe Cardiovascular Collapse
Abstract
Objective: To analyze the changes in mean arterial pressure (MAP) and end-tidal CO2 (ETCO2) in patients after emergency endotracheal intubation (ETI). To explore t...
The Research of Measuring Approach and Energy Efficiency for Hadoop Periodic Jobs
Current consumption of cloud computing has attracted more and more attention of scholars. The research on Hadoop as a cloud platform and its energy consumption has also received co...
MaxHadoop: An Efficient Scalable Emulation Tool to Test SDN Protocols in Emulated Hadoop Environments
Abstract
This paper presents MaxHadoop, a flexible and scalable emulation tool, which allows the efficient and accurate emulation of Hadoop environments over Software Defined Networ...
Re-examination of the “Joseon Map” in Fuchs' The Complete Atlas of the Imperial Territory (Kangxi Period)
The “Hwang yeo jeon lam do 皇輿全覽圖 (Atlas of the Chinese Empire)” of the Kangxi reign was the first map in traditional Chinese cartography to be created through the use of latitude and longitu...

