CC-MRSJ: Cache Conscious Star Join Algorithm on Hadoop Platform

A cache-conscious MapReduce star join algorithm is presented. Each column of the fact table is stored separately, and each dimension table is divided into several column families according to its dimension hierarchy. Each fact table foreign key column is co-located with its corresponding dimension table, thus red...

Bibliographic Details
Main Authors: Guoliang Zhou, Yongli Zhu, Guilan Wang
Format: Article
Language: Chinese (zho)
Published: Beijing Xintong Media Co., Ltd 2013-10-01
Series: Dianxin kexue
Subjects: star join, MapReduce, cache conscious, storage model
Online Access: http://www.telecomsci.com/zh/article/doi/10.3969/j.issn.1000-0801.2013.10.007/
author Guoliang Zhou
Yongli Zhu
Guilan Wang
collection DOAJ
description This paper presents CC-MRSJ, a cache-conscious MapReduce star join algorithm. Each column of the fact table is stored separately, and each dimension table is divided into several column families according to its dimension hierarchy. Each foreign key column of the fact table is co-located with its corresponding dimension table, which reduces data movement during the join. CC-MRSJ consists of two phases: first, each foreign key column is joined with its corresponding dimension table; then the intermediate results are joined with one another and the measure columns are fetched by random access to produce the final result. CC-MRSJ reads only the data it needs and achieves high cache utilization, so it is cache conscious; it also takes advantage of late materialization, avoiding unnecessary data access and movement. In experiments on Star Schema Benchmark (SSB) datasets, CC-MRSJ outperforms Hive.
format Article
id doaj-art-e49c9a527777451eaf2a166324d532ed
institution Kabale University
issn 1000-0801
language zho
publishDate 2013-10-01
publisher Beijing Xintong Media Co., Ltd
record_format Article
series Dianxin kexue
title CC-MRSJ: Cache Conscious Star Join Algorithm on Hadoop Platform
topic star join
MapReduce
cache conscious
storage model
url http://www.telecomsci.com/zh/article/doi/10.3969/j.issn.1000-0801.2013.10.007/
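
The two phases described in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation and does not use Hadoop or MapReduce; it only mirrors the idea of joining each foreign key column with its dimension table first, then joining the intermediate results by row position and reading the measure column only for the surviving rows (late materialization). All names and data here (`FACT_COLUMNS`, `DIMENSIONS`, `phase_one`, `phase_two`, `star_join`) are hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy data: the fact table is stored one list per column,
# and each dimension table is a dict keyed by its primary key.
FACT_COLUMNS: Dict[str, List[int]] = {
    "date_fk":     [1, 2, 1, 3, 2],
    "customer_fk": [10, 11, 11, 12, 11],
    "revenue":     [100, 250, 75, 300, 50],
}
DIMENSIONS: Dict[str, Dict[int, dict]] = {
    "date_fk": {1: {"year": 1997}, 2: {"year": 1998}, 3: {"year": 1997}},
    "customer_fk": {
        10: {"region": "ASIA"},
        11: {"region": "EUROPE"},
        12: {"region": "ASIA"},
    },
}


def phase_one(
    fk_column: List[int],
    dimension: Dict[int, dict],
    predicate: Callable[[dict], bool],
) -> Dict[int, dict]:
    """Join one foreign key column with its dimension table.

    Returns {row position: dimension row} for rows whose dimension
    row exists and satisfies the predicate.
    """
    matches: Dict[int, dict] = {}
    for pos, key in enumerate(fk_column):
        row = dimension.get(key)
        if row is not None and predicate(row):
            matches[pos] = row
    return matches


def phase_two(
    intermediates: List[Dict[int, dict]],
    measure_column: List[int],
) -> List[Tuple[int, dict, int]]:
    """Join the intermediate results on row position, then read the
    measure column only at the positions that survive every join."""
    if not intermediates:
        raise ValueError("at least one intermediate result is required")
    surviving = set(intermediates[0])
    for inter in intermediates[1:]:
        surviving &= set(inter)
    output = []
    for pos in sorted(surviving):
        dims: dict = {}
        for inter in intermediates:
            dims.update(inter[pos])
        # Random access into the measure column (late materialization).
        output.append((pos, dims, measure_column[pos]))
    return output


def star_join(
    predicates: Dict[str, Callable[[dict], bool]],
    measure: str,
) -> List[Tuple[int, dict, int]]:
    """Run both phases over the toy tables defined above."""
    intermediates = [
        phase_one(FACT_COLUMNS[fk], DIMENSIONS[fk], pred)
        for fk, pred in predicates.items()
    ]
    return phase_two(intermediates, FACT_COLUMNS[measure])
```

For example, calling `star_join` with a predicate on `year` for `date_fk` and a predicate on `region` for `customer_fk` touches the `revenue` column only for rows that pass both dimension filters, which is the access pattern the abstract credits for avoiding unnecessary data access.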