A hybrid attention multi-scale fusion network for real-time semantic segmentation

Main Authors: Baofeng Ye, Renzheng Xue, Qianlong Wu
Format: Article
Language:English
Published: Nature Portfolio 2025-01-01
Series:Scientific Reports
Online Access:https://doi.org/10.1038/s41598-024-84685-6
collection DOAJ
description Abstract: Spatial information and receptive fields are both essential in semantic segmentation. Most current algorithms, however, focus on acquiring semantic information and lose a significant amount of spatial information, so accuracy drops markedly even as real-time inference speed improves. This paper proposes a new method to address this issue. Specifically, we design a new module (HFRM) that combines channel attention and spatial attention to recover the spatial information lost during downsampling and to improve object classification accuracy. To fuse spatial and semantic information, we design a second module (HFFM) that merges features from two different levels more effectively and captures a larger receptive field through an attention mechanism. In addition, edge detection is incorporated to strengthen the extraction of boundary information. Experimental results show that, for an input size of 512 × 1024, the proposed method achieves 73.6% mIoU at 176 frames per second (FPS) on the Cityscapes dataset and 70.0% mIoU at 146 FPS on CamVid. Compared with existing networks, our model achieves faster inference while maintaining accuracy, which makes it more practical.
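The mIoU figures quoted in the abstract are mean intersection-over-union scores. The following is a minimal pure-Python sketch of that metric, offered as an illustration and not as the authors' evaluation code. The function name `mean_iou`, the flattened label lists, and the ignore label 255 (a common convention for Cityscapes training labels) are assumptions that do not come from this record.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flattened per-pixel class ids of equal length.
    Pixels whose target equals ignore_index are skipped (assumed convention).
    Predicted ids are assumed to lie in [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Pixel belongs to both the predicted and the true region of class t.
            intersection[t] += 1
            union[t] += 1
        else:
            # Pixel is a false positive for class p and a false negative for class t.
            union[p] += 1
            union[t] += 1
    # Classes that never occur are left out of the average.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(f"mIoU = {mean_iou(pred, target, num_classes=3):.4f}")
```

To score a whole validation set in this sketch, the per-image label maps would be flattened and concatenated before the call, so that intersections and unions accumulate over all images instead of being averaged per image.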
id doaj-art-ebec7bbdbcc942e8a853f68700712b05
institution Kabale University
issn 2045-2322
affiliation School of Computer and Control Engineering, Qiqihar University (Baofeng Ye, Renzheng Xue, Qianlong Wu)
topic Semantic segmentation
Real-time processing
Attention mechanism
Edge detection
Receptive field