Implementation of the Hadoop Framework in a Cloud Environment and Terabyte Sort Experiments (Hadoop Çatısının Bulut Ortamında Gerçeklenmesi Ve Terabyte Sort Deneyleri)

Bibliographic Details
Main Authors: G. Ozen, R. Sultanov
Format: Article
Language: English
Published: Kyrgyz Turkish Manas University 2015-05-01
Series: MANAS: Journal of Engineering
Online Access: https://dergipark.org.tr/en/download/article-file/575941
Description
Summary: The Hadoop framework uses the MapReduce programming paradigm to process big data by distributing the data across a cluster and aggregating the results. MapReduce is one of the methods used to process big data hosted on large clusters: a job is divided into small pieces that are distributed over the nodes. The number of nodes in the cluster affects the execution time of jobs. The main aim of this paper is to determine, using benchmarking tools, how the number of nodes affects the performance of the Hadoop framework in a cloud environment. For this purpose, various tests were carried out by running the Terabyte Sort benchmarking tools on a 10-node Hadoop cluster hosted in a cloud environment. According to the test results, increasing the number of nodes improves the job execution performance of the Hadoop framework and reduces job execution time.
ISSN: 1694-7398