Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution

Bibliographic Details
Main Authors: Gang XIE, Quanyi WANG, Xinlin XIE, Jian’an WANG
Format: Article
Language: zho
Published: Editorial Department of Journal on Communications 2023-10-01
Series: Tongxin xuebao
Subjects: semantic segmentation; deep learning; attention mechanism; lightweight; traffic scene
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2023194/
collection DOAJ
description To address two problems in traffic scene semantic segmentation, namely the discontinuous segmentation of thin strip objects that blend easily into the surrounding background and the large number of model parameters, a lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution was proposed. First, a multi-scale strip feature extraction module (MSEM) was constructed based on depth convolution to enhance the representation of thin strip target features at different scales. Second, a spatial detail auxiliary module (SDAM) was designed that uses the convolutional inductive bias of the shallow network to compensate for the loss of spatial detail in the deep layers and thereby optimize object edge segmentation. Finally, an asymmetric encoding-decoding network based on the Transformer-CNN framework (TC-AEDNet) was proposed. The encoder combined Transformer and CNN to alleviate the loss of detail information and reduce the number of model parameters, while the decoder adopted a lightweight multi-level feature fusion design to further model the global context. The proposed algorithm achieved a mean intersection over union (mIoU) of 78.63% and 81.06% on the Cityscapes and CamVid public traffic scene datasets, respectively. It achieves a trade-off between segmentation accuracy and model size in traffic scene semantic segmentation and has good application prospects.
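The mIoU figures quoted in the description follow the standard definition: for each class, the intersection of predicted and ground-truth pixels divided by their union, averaged over classes. The sketch below illustrates that definition only; the function names and the toy labels are hypothetical, and the record does not say how the authors handled ignored labels or accumulated statistics over a dataset, so this should not be read as their evaluation code.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    matrix = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # true class c, predicted otherwise
        fp = sum(row[c] for row in matrix) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from labels and predictions
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with flattened per-pixel labels for three classes (made-up data).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
score = mean_iou(labels, preds, num_classes=3)
```

Here the per-class IoUs are 1/3, 2/3 and 1/2, so `score` is 0.5. Benchmark evaluations normally accumulate a single confusion matrix over all test images before dividing, rather than averaging per-image scores.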
id doaj-art-e92804b1a4424b7db9a5ad81465d580b
institution Kabale University
issn 1000-436X
spelling doaj-art-e92804b1a4424b7db9a5ad81465d580b 2025-01-14T06:23:36Z zho Editorial Department of Journal on Communications Tongxin xuebao 1000-436X 2023-10-01 4421322559388645
topic semantic segmentation
deep learning
attention mechanism
lightweight
traffic scene