RiskTree: Decision trees for asset and process risk assessment quantification in big data platforms
Currently, big data platforms are widely applied across various industries. These platforms are characterized by large scale, diverse forms, high update frequency, and rapid data flow, making it challenging to directly apply existing risk quantification methods to them. Additionally, the composition...
Main Authors: | Zhan Haomou, Yang Jiawei, Guo Zhenyang, Cao Jin, Zhang Dong, Zhao Xingwen, You Wei, Li Hui |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2024-01-01 |
Series: | Security and Safety |
Subjects: | big data platform; quantitative risk assessment; machine learning |
Online Access: | https://sands.edpsciences.org/articles/sands/full_html/2024/01/sands20240012/sands20240012.html |
_version_ | 1841554682372161536 |
---|---|
author | Zhan Haomou; Yang Jiawei; Guo Zhenyang; Cao Jin; Zhang Dong; Zhao Xingwen; You Wei; Li Hui |
author_facet | Zhan Haomou; Yang Jiawei; Guo Zhenyang; Cao Jin; Zhang Dong; Zhao Xingwen; You Wei; Li Hui |
author_sort | Zhan Haomou |
collection | DOAJ |
description | Currently, big data platforms are widely applied across various industries. These platforms are characterized by large scale, diverse forms, high update frequency, and rapid data flow, making it challenging to directly apply existing risk quantification methods to them. Additionally, the composition of big data platforms varies among enterprises due to factors such as industry, economic capability, and technical proficiency. To address this, we first developed a risk quantification assessment process tailored to different types of big data platforms, taking into account relevant laws, regulations, and standards. Subsequently, we developed RiskTree, a risk quantification system for big data platforms, which supports automated detection of configuration files, traffic, and vulnerabilities. For situations where automated detection is not feasible or permitted, we provide a customized questionnaire system to collect assets and data processing procedures. We utilize a knowledge graph (KG) to integrate and analyze the collected data. Finally, we apply a random forest algorithm to compute risk index weights, risk values, and risk levels, enabling the quantification of risks on big data platforms. To validate the proposed process, we conducted experiments on an educational big data platform. The results demonstrate that the risk index system presented in this paper objectively and comprehensively reflects the risks faced by big data platforms. Furthermore, the proposed risk assessment process not only effectively identifies and quantifies risks but also provides highly interpretable evaluation results. |
format | Article |
id | doaj-art-0f0c1d8900a74b43bd4d09e572d4af65 |
institution | Kabale University |
issn | 2826-1275 |
language | English |
publishDate | 2024-01-01 |
publisher | EDP Sciences |
record_format | Article |
series | Security and Safety |
spelling | doaj-art-0f0c1d8900a74b43bd4d09e572d4af65; 2025-01-08T11:21:32Z; eng; EDP Sciences; Security and Safety; 2826-1275; 2024-01-01; 3; 2024009; 10.1051/sands/2024009; sands20240012; RiskTree: Decision trees for asset and process risk assessment quantification in big data platforms; Zhan Haomou [0] (https://orcid.org/0009-0002-6680-7694); Yang Jiawei [1]; Guo Zhenyang [2] (https://orcid.org/0000-0003-3519-2674); Cao Jin [3] (https://orcid.org/0000-0003-1372-7252); Zhang Dong [4]; Zhao Xingwen [5]; You Wei [6]; Li Hui [7] (https://orcid.org/0000-0001-8310-7169); [0] School of Cyber Engineering, Xidian University; [1] School of Cyber Engineering, Xidian University; [2] School of Cyber Engineering, Xidian University; [3] School of Cyber Engineering, Xidian University; [4] National Computer Network Emergency Response Technical Team/Coordination Center of China (CNCERT/CC); [5] School of Cyber Engineering, Xidian University; [6] School of Cyber Engineering, Xidian University; [7] School of Cyber Engineering, Xidian University; Currently, big data platforms are widely applied across various industries. These platforms are characterized by large scale, diverse forms, high update frequency, and rapid data flow, making it challenging to directly apply existing risk quantification methods to them. Additionally, the composition of big data platforms varies among enterprises due to factors such as industry, economic capability, and technical proficiency. To address this, we first developed a risk quantification assessment process tailored to different types of big data platforms, taking into account relevant laws, regulations, and standards. Subsequently, we developed RiskTree, a risk quantification system for big data platforms, which supports automated detection of configuration files, traffic, and vulnerabilities. For situations where automated detection is not feasible or permitted, we provide a customized questionnaire system to collect assets and data processing procedures. We utilize a knowledge graph (KG) to integrate and analyze the collected data. Finally, we apply a random forest algorithm to compute risk index weights, risk values, and risk levels, enabling the quantification of risks on big data platforms. To validate the proposed process, we conducted experiments on an educational big data platform. The results demonstrate that the risk index system presented in this paper objectively and comprehensively reflects the risks faced by big data platforms. Furthermore, the proposed risk assessment process not only effectively identifies and quantifies risks but also provides highly interpretable evaluation results.; https://sands.edpsciences.org/articles/sands/full_html/2024/01/sands20240012/sands20240012.html; big data platform; quantitative risk assessment; machine learning |
spellingShingle | Zhan Haomou; Yang Jiawei; Guo Zhenyang; Cao Jin; Zhang Dong; Zhao Xingwen; You Wei; Li Hui; RiskTree: Decision trees for asset and process risk assessment quantification in big data platforms; Security and Safety; big data platform; quantitative risk assessment; machine learning |
title | RiskTree: Decision trees for asset and process risk assessment quantification in big data platforms |
title_full | RiskTree: Decision trees for asset and process risk assessment quantification in big data platforms |
title_fullStr | RiskTree: Decision trees for asset and process risk assessment quantification in big data platforms |
title_full_unstemmed | RiskTree: Decision trees for asset and process risk assessment quantification in big data platforms |
title_short | RiskTree: Decision trees for asset and process risk assessment quantification in big data platforms |
title_sort | risktree decision trees for asset and process risk assessment quantification in big data platforms |
topic | big data platform; quantitative risk assessment; machine learning |
url | https://sands.edpsciences.org/articles/sands/full_html/2024/01/sands20240012/sands20240012.html |
work_keys_str_mv | AT zhanhaomou risktreedecisiontreesforassetandprocessriskassessmentquantificationinbigdataplatforms AT yangjiawei risktreedecisiontreesforassetandprocessriskassessmentquantificationinbigdataplatforms AT guozhenyang risktreedecisiontreesforassetandprocessriskassessmentquantificationinbigdataplatforms AT caojin risktreedecisiontreesforassetandprocessriskassessmentquantificationinbigdataplatforms AT zhangdong risktreedecisiontreesforassetandprocessriskassessmentquantificationinbigdataplatforms AT zhaoxingwen risktreedecisiontreesforassetandprocessriskassessmentquantificationinbigdataplatforms AT youwei risktreedecisiontreesforassetandprocessriskassessmentquantificationinbigdataplatforms AT lihui risktreedecisiontreesforassetandprocessriskassessmentquantificationinbigdataplatforms |
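
The description above says that risk index weights are combined into risk values and risk levels. The record gives no formulas, so the following is only a minimal sketch of one common way to do that step: a weighted average of per-index scores mapped onto level bands. The index names, the weight values, the thresholds, and the function names `risk_value` and `risk_level` are all hypothetical and are not taken from the paper. In RiskTree the weights are reported to come from a random forest, whereas here they are fixed by hand.

```python
from typing import Dict

# Hypothetical index weights (assumption). The paper derives weights with a
# random forest; these hand-picked numbers are for illustration only.
WEIGHTS: Dict[str, float] = {
    "configuration": 0.5,
    "traffic": 0.25,
    "vulnerability": 0.25,
}

# Hypothetical level bands on a 0..1 scale: (inclusive upper bound, label).
LEVELS = [(0.25, "low"), (0.5, "medium"), (0.75, "high"), (1.0, "critical")]


def risk_value(scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Return the weighted average of per-index scores (each expected in [0, 1])."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights) / total


def risk_level(value: float) -> str:
    """Map a risk value in [0, 1] to the first band whose upper bound covers it."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("risk value must be in [0, 1]")
    for upper, label in LEVELS:
        if value <= upper:
            return label
    raise ValueError("no level band covers this value")


if __name__ == "__main__":
    example = {"configuration": 0.5, "traffic": 1.0, "vulnerability": 0.0}
    value = risk_value(example)
    print(value, risk_level(value))
```

Dividing by the weight total keeps the result in [0, 1] even if the supplied weights are not normalized, which is convenient when weights come from a model's feature importances.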