Hadoop bottleneck detection algorithm based on information gain

Hadoop has become a major platform for big data storage and large-scale data mining. Although the Hadoop platform achieves high-performance parallel computing through a distributed cluster of machines, bottlenecks will inevitably appear on a machine when cluster load increases, because the cluster i...

Bibliographic Details
Main Authors: Zaole TAN, Zhifeng HAO, Ruichu CAI, Xiaojun XIAO, Yu LU
Format: Article
Language: zho
Published: Beijing Xintong Media Co., Ltd 2016-07-01
Series: Dianxin kexue
Subjects: big data; Hadoop; information gain; bottleneck detection
Online Access: http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2016203/
_version_ 1841529090777022464
author Zaole TAN
Zhifeng HAO
Ruichu CAI
Xiaojun XIAO
Yu LU
collection DOAJ
description Hadoop has become a major platform for big data storage and large-scale data mining. Although the Hadoop platform achieves high-performance parallel computing through a distributed cluster of machines, bottlenecks inevitably appear on individual machines as cluster load increases, because the cluster is composed of inexpensive hosts. To address this problem, a bottleneck detection algorithm based on information gain is proposed. The algorithm detects the cluster's bottleneck resource by computing the information gain of each resource. Experiments show that the bottleneck detection algorithm is feasible.
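The record gives no details of the algorithm beyond "computing the information gain of each resource", so the following is only a minimal sketch of that general idea, not the authors' method. It assumes that each resource's utilization has been discretized into levels such as "high" and "low", and that each observation carries a label saying whether the job ran slowly. The resource names, the sample data, and the helper names `entropy`, `information_gain`, and `rank_resources` are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_resources(samples, labels):
    """Return (resource, gain) pairs sorted by information gain, highest first."""
    gains = {name: information_gain(values, labels)
             for name, values in samples.items()}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical, hand-made observations: one entry per monitoring interval.
samples = {
    "cpu":     ["high", "low", "high", "low", "high", "low", "high", "low"],
    "memory":  ["low", "low", "high", "high", "low", "low", "high", "high"],
    "disk_io": ["high", "high", "high", "high", "low", "low", "low", "low"],
}
labels = ["slow", "slow", "slow", "slow",
          "normal", "normal", "normal", "normal"]

ranking = rank_resources(samples, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data set, disk I/O level separates slow intervals from normal ones perfectly, so it receives the highest gain and would be flagged as the candidate bottleneck resource. How the published algorithm discretizes metrics or labels intervals is not stated in this record.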
format Article
id doaj-art-5b8b889e2fbb4f898a93d0ffea34d7cb
institution Kabale University
issn 1000-0801
language zho
publishDate 2016-07-01
publisher Beijing Xintong Media Co., Ltd
record_format Article
series Dianxin kexue
title Hadoop bottleneck detection algorithm based on information gain
topic big data
Hadoop
information gain
bottleneck detection
url http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2016203/