Network log analysis with SQL-on-Hadoop

With the rapid expansion of network bandwidth, devices, and applications, log management faces the challenge of exploding data volumes. A log analysis platform built on SQL-on-Hadoop can store and query hundreds of billions of log entries effectively. Columnar and compressed data formats...

Bibliographic Details
Main Authors: Si-yu ZHANG, Kai-da JIANG, Jian-wen WEI, Xuan LUO, Hai-yang WANG
Format: Article
Language: Chinese (zho)
Published: Editorial Department of Journal on Communications 2014-10-01
Series: Tongxin xuebao
Subjects: log analysis; big data; Hadoop; SQL; network security
Online Access: http://www.joconline.com.cn/zh/article/doi/10.3969/j.issn.1000-436x.2014.z1.004/
collection DOAJ
description With the rapid expansion of network bandwidth, devices, and applications, log management faces the challenge of exploding data volumes. A log analysis platform built on SQL-on-Hadoop can store and query hundreds of billions of log entries effectively. Columnar and compressed data formats for Hadoop are benchmarked with a real-world multi-TB dataset. The conditional and statistical query efficiency of Hive and Impala is tested. With the gzip-compressed Parquet format, log data can be compressed by 80%, and querying with Impala is 5 times faster. Six security incident analysis and detection applications are already deployed on this platform.
id doaj-art-9b341e5ac8ae4d7e84d2ac80c085afeb
institution Kabale University
issn 1000-436X
topic log analysis
big data
Hadoop
SQL
network security