Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution
To address the discontinuous segmentation of thin strip objects that blend easily into the surrounding background, and the large number of model parameters in traffic scene semantic segmentation algorithms, a lightweight Transformer traffic scene semantic segmentation algorithm...
Main Authors: | Gang XIE, Quanyi WANG, Xinlin XIE, Jian’an WANG |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2023-10-01 |
Series: | Tongxin xuebao |
Subjects: | semantic segmentation, deep learning, attention mechanism, lightweight, traffic scene |
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2023194/ |
_version_ | 1841540081827971072 |
---|---|
author | Gang XIE Quanyi WANG Xinlin XIE Jian’an WANG |
author_facet | Gang XIE Quanyi WANG Xinlin XIE Jian’an WANG |
author_sort | Gang XIE |
collection | DOAJ |
description | To address the discontinuous segmentation of thin strip objects that blend easily into the surrounding background, and the large number of model parameters in traffic scene semantic segmentation algorithms, a lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution was proposed. First, a multi-scale strip feature extraction module (MSEM) was constructed based on depth convolution to enhance the representation of thin strip target features at different scales. Second, a spatial detail auxiliary module (SDAM) was designed, using the convolutional inductive bias of the shallow network to compensate for the loss of spatial detail information in the deep layers and thereby refine object edge segmentation. Finally, an asymmetric encoding-decoding network based on the Transformer-CNN framework (TC-AEDNet) was proposed. The encoder combined Transformer and CNN to alleviate the loss of detail information and reduce the number of model parameters, while the decoder adopted a lightweight multi-level feature fusion design to further model the global context. The proposed algorithm achieved a mean intersection over union (mIoU) of 78.63% and 81.06% on the Cityscapes and CamVid public traffic scene datasets, respectively. It achieves a trade-off between segmentation accuracy and model size in traffic scene semantic segmentation and has good application prospects. |
format | Article |
id | doaj-art-e92804b1a4424b7db9a5ad81465d580b |
institution | Kabale University |
issn | 1000-436X |
language | zho |
publishDate | 2023-10-01 |
publisher | Editorial Department of Journal on Communications |
record_format | Article |
series | Tongxin xuebao |
spelling | doaj-art-e92804b1a4424b7db9a5ad81465d580b 2025-01-14T06:23:36Z zho Editorial Department of Journal on Communications Tongxin xuebao 1000-436X 2023-10-01 4421322559388645 Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution Gang XIE Quanyi WANG Xinlin XIE Jian’an WANG To address the discontinuous segmentation of thin strip objects that blend easily into the surrounding background, and the large number of model parameters in traffic scene semantic segmentation algorithms, a lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution was proposed. First, a multi-scale strip feature extraction module (MSEM) was constructed based on depth convolution to enhance the representation of thin strip target features at different scales. Second, a spatial detail auxiliary module (SDAM) was designed, using the convolutional inductive bias of the shallow network to compensate for the loss of spatial detail information in the deep layers and thereby refine object edge segmentation. Finally, an asymmetric encoding-decoding network based on the Transformer-CNN framework (TC-AEDNet) was proposed. The encoder combined Transformer and CNN to alleviate the loss of detail information and reduce the number of model parameters, while the decoder adopted a lightweight multi-level feature fusion design to further model the global context. The proposed algorithm achieved a mean intersection over union (mIoU) of 78.63% and 81.06% on the Cityscapes and CamVid public traffic scene datasets, respectively. It achieves a trade-off between segmentation accuracy and model size in traffic scene semantic segmentation and has good application prospects. http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2023194/ semantic segmentation deep learning attention mechanism lightweight traffic scene |
spellingShingle | Gang XIE Quanyi WANG Xinlin XIE Jian’an WANG Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution Tongxin xuebao semantic segmentation deep learning attention mechanism lightweight traffic scene |
title | Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution |
title_full | Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution |
title_fullStr | Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution |
title_full_unstemmed | Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution |
title_short | Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution |
title_sort | lightweight transformer traffic scene semantic segmentation algorithm integrating multi scale depth convolution |
topic | semantic segmentation deep learning attention mechanism lightweight traffic scene |
url | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2023194/ |
work_keys_str_mv | AT gangxie lightweighttransformertrafficscenesemanticsegmentationalgorithmintegratingmultiscaledepthconvolution AT quanyiwang lightweighttransformertrafficscenesemanticsegmentationalgorithmintegratingmultiscaledepthconvolution AT xinlinxie lightweighttransformertrafficscenesemanticsegmentationalgorithmintegratingmultiscaledepthconvolution AT jiananwang lightweighttransformertrafficscenesemanticsegmentationalgorithmintegratingmultiscaledepthconvolution |
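
The description above reports results as mean intersection over union (mIoU). As an illustration of how that metric is commonly computed, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the function name mean_iou, the flat label-list input, and the ignore label 255 (a common convention for unlabeled pixels) are all assumptions made for this example, and the record does not say which evaluation protocol the paper used.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "unlabeled" value, not taken from the record
) -> float:
    """Average per-class IoU over classes that occur in pred or target.

    pred and target are flattened per-pixel class labels of equal length.
    Predicted labels are assumed to lie in range(num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels without a ground-truth label
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # classes absent from both pred and target are skipped
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    prediction = [0, 0, 1, 1, 2, 2]
    ground_truth = [0, 1, 1, 1, 2, 255]
    print(f"mIoU = {mean_iou(prediction, ground_truth, num_classes=3):.4f}")
```

In this toy example the per-class IoU values are 1/2, 2/3, and 1, so the mean is 13/18. Benchmark evaluations usually accumulate the intersection and union counts over the whole test set before dividing, rather than averaging per-image scores.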