Parallel deep forest algorithm based on Spark and three-way interactive information

To address issues such as excessive redundancy and irrelevant features, long class vectors, slow model convergence, and low efficiency of parallel training in parallel deep forests, a parallel deep forest algorithm based on Spark and three-way interactive information was proposed. Firstly, a feature...

Bibliographic Details
Main Authors: Yimin MAO, Zhan ZHOU, Zhigang CHEN
Format: Article
Language: zho
Published: Editorial Department of Journal on Communications 2023-08-01
Series: Tongxin xuebao
Subjects: Spark framework; parallel deep forest algorithm; feature selection; multilevel load balancing
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2023143/
author Yimin MAO
Zhan ZHOU
Zhigang CHEN
collection DOAJ
description To address the problems of parallel deep forests, namely excessive redundant and irrelevant features, overly long class vectors, slow model convergence, and low parallel training efficiency, a parallel deep forest algorithm based on Spark and three-way interactive information was proposed. Firstly, a feature selection based on feature interaction (FSFI) strategy was proposed to filter the original features and eliminate irrelevant and redundant ones. Secondly, a multi-granularity vector elimination (MGVE) strategy was proposed, which fuses similar class vectors and shortens the class vector length. Subsequently, a cascade forest feature enhancement (CFFE) strategy was proposed to improve information utilization and accelerate model convergence. Finally, a multi-level load balancing (MLB) strategy was proposed, combined with the Spark framework, to improve parallelization efficiency through adaptive sub-forest division and heterogeneous skew data partitioning. Experimental results demonstrate that the proposed algorithm significantly improves classification performance and reduces parallel training time.
format Article
id doaj-art-27242f1de2c748179d1ab37beda128cb
institution Kabale University
issn 1000-436X
language zho
publishDate 2023-08-01
publisher Editorial Department of Journal on Communications
record_format Article
series Tongxin xuebao
spelling doaj-art-27242f1de2c748179d1ab37beda128cb | 2025-01-14T06:22:52Z | zho | Editorial Department of Journal on Communications | Tongxin xuebao | 1000-436X | 2023-08-01 | 4422824059386129 | Parallel deep forest algorithm based on Spark and three-way interactive information | Yimin MAO | Zhan ZHOU | Zhigang CHEN | To address the problems of parallel deep forests, namely excessive redundant and irrelevant features, overly long class vectors, slow model convergence, and low parallel training efficiency, a parallel deep forest algorithm based on Spark and three-way interactive information was proposed. Firstly, a feature selection based on feature interaction (FSFI) strategy was proposed to filter the original features and eliminate irrelevant and redundant ones. Secondly, a multi-granularity vector elimination (MGVE) strategy was proposed, which fuses similar class vectors and shortens the class vector length. Subsequently, a cascade forest feature enhancement (CFFE) strategy was proposed to improve information utilization and accelerate model convergence. Finally, a multi-level load balancing (MLB) strategy was proposed, combined with the Spark framework, to improve parallelization efficiency through adaptive sub-forest division and heterogeneous skew data partitioning. Experimental results demonstrate that the proposed algorithm significantly improves classification performance and reduces parallel training time. | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2023143/ | Spark framework | parallel deep forest algorithm | feature selection | multilevel load balancing
title Parallel deep forest algorithm based on Spark and three-way interactive information
topic Spark framework
parallel deep forest algorithm
feature selection
multilevel load balancing
url http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2023143/