Enhancing Road Scene Segmentation With an Optimized DeepLabV3+
Semantic segmentation, as a dense prediction task, is inevitably affected by various external factors, making common road image semantic segmentation models unable to meet the dual demands of high accuracy and real-time performance in unstructured road scenarios. To address these issues, this paper propo...
Main Authors: | Zhe Ren, Libao Wang, Tianming Song, Yihang Li, Jian Zhang, Fengfeng Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2024-01-01 |
Series: | IEEE Access |
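Subjects: | Autonomous driving; convolutional neural networks; deep learning; road scene segmentation; semantic segmentation |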
Online Access: | https://ieeexplore.ieee.org/document/10812701/ |
_version_ | 1841533385918382080 |
---|---|
author | Zhe Ren; Libao Wang; Tianming Song; Yihang Li; Jian Zhang; Fengfeng Zhao |
author_facet | Zhe Ren; Libao Wang; Tianming Song; Yihang Li; Jian Zhang; Fengfeng Zhao |
author_sort | Zhe Ren |
collection | DOAJ |
description | Semantic segmentation, as a dense prediction task, is inevitably affected by various external factors, making common road image semantic segmentation models unable to meet the dual demands of high accuracy and real-time performance in unstructured road scenarios. To address these issues, this paper proposes an enhanced road scene segmentation method based on DeepLabV3+ that addresses the common trade-offs between accuracy and real-time performance in existing approaches. First, the heavy Xception backbone is replaced with the lightweight MobileNetV2, significantly boosting real-time efficiency while maintaining competitive segmentation accuracy. Second, the Atrous Spatial Pyramid Pooling (ASPP) module is optimized by introducing depthwise separable convolutions and a hierarchical feature fusion strategy, reducing computational complexity and mitigating the grid effect, a limitation in many current models. Finally, a Shuffle Attention mechanism is incorporated to improve the handling of small objects and fine details, such as distant pedestrians or items held by them, enhancing segmentation precision without excessive computational overhead. The method was trained and evaluated on the Cityscapes and CamVid datasets, achieving 84.3% mPA and 41.8 FPS on Cityscapes, and 78.1% mPA and 30.2 FPS on CamVid. These experimental results demonstrate a significant improvement in balancing detection capabilities with real-time performance. |
format | Article |
id | doaj-art-0a55601030534f2299f2049d1995c878 |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-0a55601030534f2299f2049d1995c878; 2025-01-16T00:02:00Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 197748-197765; DOI 10.1109/ACCESS.2024.3521597; document 10812701; Enhancing Road Scene Segmentation With an Optimized DeepLabV3+; Zhe Ren (0), Libao Wang (1), Tianming Song (2), Yihang Li (3), Jian Zhang (4), Fengfeng Zhao (5); https://orcid.org/0009-0008-3543-0584; School of Intelligent Manufacturing, Wuxi Vocational College of Science and Technology, Wuxi, China; Shandong Hopetry Information Technology Company, Linyi, China; School of Intelligent Manufacturing, Wuxi Vocational College of Science and Technology, Wuxi, China; Hangzhou Wenyi Street Primary School, Hangzhou, China; School of Intelligent Manufacturing, Wuxi Vocational College of Science and Technology, Wuxi, China; College of Computer and Control Engineering, Northeast Forestry University, Harbin, China; https://ieeexplore.ieee.org/document/10812701/; Autonomous driving; convolutional neural networks; deep learning; road scene segmentation; semantic segmentation |
spellingShingle | Zhe Ren; Libao Wang; Tianming Song; Yihang Li; Jian Zhang; Fengfeng Zhao; Enhancing Road Scene Segmentation With an Optimized DeepLabV3+; IEEE Access; Autonomous driving; convolutional neural networks; deep learning; road scene segmentation; semantic segmentation |
title | Enhancing Road Scene Segmentation With an Optimized DeepLabV3+ |
title_full | Enhancing Road Scene Segmentation With an Optimized DeepLabV3+ |
title_fullStr | Enhancing Road Scene Segmentation With an Optimized DeepLabV3+ |
title_full_unstemmed | Enhancing Road Scene Segmentation With an Optimized DeepLabV3+ |
title_short | Enhancing Road Scene Segmentation With an Optimized DeepLabV3+ |
title_sort | enhancing road scene segmentation with an optimized deeplabv3 |
topic | Autonomous driving; convolutional neural networks; deep learning; road scene segmentation; semantic segmentation |
url | https://ieeexplore.ieee.org/document/10812701/ |
work_keys_str_mv | AT zheren enhancingroadscenesegmentationwithanoptimizeddeeplabv3 AT libaowang enhancingroadscenesegmentationwithanoptimizeddeeplabv3 AT tianmingsong enhancingroadscenesegmentationwithanoptimizeddeeplabv3 AT yihangli enhancingroadscenesegmentationwithanoptimizeddeeplabv3 AT jianzhang enhancingroadscenesegmentationwithanoptimizeddeeplabv3 AT fengfengzhao enhancingroadscenesegmentationwithanoptimizeddeeplabv3 |
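
The description credits depthwise separable convolutions with reducing the cost of the ASPP module and reports results as mPA. The sketch below is only an illustration of those two ideas, not the authors' implementation: it compares weight counts for a standard and a depthwise separable 3 x 3 convolution, and computes mean pixel accuracy from a confusion matrix. The channel width of 256, the toy confusion matrix, and all function names are hypothetical, and the mPA formula is assumed to be the usual per-class average of correctly labelled pixels.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a k x k standard convolution (bias ignored).

    Dilation (atrous rate) spreads the kernel taps apart but does not
    change the number of weights.
    """
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def mean_pixel_accuracy(confusion):
    """Mean over classes of (correct pixels of class i / true pixels of class i).

    confusion[i][j] = number of pixels of true class i predicted as class j.
    Classes with no ground-truth pixels are skipped.
    """
    per_class = []
    for i, row in enumerate(confusion):
        total = sum(row)
        if total > 0:
            per_class.append(row[i] / total)
    if not per_class:
        raise ValueError("confusion matrix contains no labelled pixels")
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Hypothetical ASPP branch: 256 input channels, 256 output channels, 3 x 3 kernel.
    std = standard_conv_params(256, 256, 3)
    sep = depthwise_separable_params(256, 256, 3)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")

    # Toy two-class confusion matrix (made-up numbers, not from the paper).
    toy = [[8, 2],
           [1, 9]]
    print(f"mPA: {mean_pixel_accuracy(toy):.3f}")
```

With these assumed sizes the separable variant needs 67,840 weights against 589,824 for the standard convolution, which is the kind of saving the description refers to; the actual figures depend on the channel widths used in the paper.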