Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation

Abstract Purpose The segmentation of target volume and organs at risk (OAR) was a significant part of radiotherapy. Specifically, determining the location and scale of the esophagus in simulated computed tomography images was difficult and time-consuming primarily due to its complex structure and lo...

Bibliographic Details
Main Authors: Xiao Lou, Juan Zhu, Jian Yang, Youzhe Zhu, Huazhong Shu, Baosheng Li
Format: Article
Language: English
Published: BMC 2024-12-01
Series: BMC Medical Imaging
Subjects:
Online Access: https://doi.org/10.1186/s12880-024-01515-x
collection DOAJ
description Abstract. Purpose: Segmentation of the target volume and organs at risk (OAR) is a significant part of radiotherapy. Determining the location and scale of the esophagus in simulated computed tomography (CT) images is difficult and time-consuming, primarily because of its complex structure and low contrast with the surrounding tissues. In this study, an Enhanced Cross-stage-attention U-Net was proposed to segment the esophageal gross tumor volume (GTV) and clinical tumor volume (CTV) in CT images. Methods: First, a module based on principal component analysis theory was constructed to pre-extract features from the input image. Then, a cross-stage feature fusion model was designed to replace the skip concatenation of the original UNet; it is composed of a Wide Range Attention (WRA) unit, a Small-kernel Local Attention (SLA) unit, and an Inverted Bottleneck (IBN) unit. WRA captures global attention, and its large convolution kernel was decomposed to simplify the calculation. SLA complements WRA with local attention. IBN was constructed to fuse the extracted features and includes a global frequency response layer that redistributes the frequency response of the fused feature maps. Results: The proposed method was compared with relevant published esophageal segmentation methods. The proposed network achieved MSD = 2.83 (1.62, 4.76) mm, HD = 11.79 ± 6.02 mm, DC = 72.45 ± 19.18% for GTV, and MSD = 5.26 (2.18, 8.82) mm, HD = 16.22 ± 10.01 mm, DC = 71.06 ± 17.72% for CTV. Conclusion: Reconstructing the skip concatenation in UNet improved performance for esophageal segmentation. The results showed that the proposed network achieved better esophageal GTV and CTV segmentation.
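The Results above are reported as DC, HD, and MSD. The record does not define these abbreviations or say how they were computed, so the following is only a minimal sketch, assuming they stand for the Dice coefficient, the Hausdorff distance, and a symmetric mean surface distance. The function names, the nested-list mask format, and the use of all foreground pixels in place of extracted surface points are illustrative assumptions, not the authors' implementation.

```python
from math import dist


def dice_coefficient(pred, truth):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_p = [v for row in pred for v in row]
    flat_t = [v for row in truth for v in row]
    intersection = sum(1 for p, t in zip(flat_p, flat_t) if p and t)
    total = sum(1 for p in flat_p if p) + sum(1 for t in flat_t if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mask_points(mask, spacing=(1.0, 1.0)):
    """Foreground pixel coordinates (row, col) scaled by an assumed pixel spacing in mm."""
    return [(r * spacing[0], c * spacing[1])
            for r, row in enumerate(mask)
            for c, v in enumerate(row) if v]


def directed_distances(a, b):
    """Distance from each point in a to its nearest point in b (both non-empty)."""
    return [min(dist(p, q) for q in b) for p in a]


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(max(directed_distances(a, b)), max(directed_distances(b, a)))


def mean_surface_distance(a, b):
    """Mean of nearest-neighbour distances taken in both directions."""
    d = directed_distances(a, b) + directed_distances(b, a)
    return sum(d) / len(d)


if __name__ == "__main__":
    # Toy 2 x 3 masks, purely illustrative.
    pred = [[0, 1, 1], [0, 1, 0]]
    truth = [[0, 1, 0], [0, 1, 1]]
    p, t = mask_points(pred), mask_points(truth)
    print("DC :", dice_coefficient(pred, truth))
    print("HD :", hausdorff_distance(p, t))
    print("MSD:", mean_surface_distance(p, t))
```

A real evaluation would work on 3D volumes, extract contour or surface voxels before measuring distances, and use the scanner's voxel spacing; the brute-force nearest-neighbour search here is only practical for small point sets.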
id doaj-art-57267d0d34b140ceb8d967ef44272e48
institution Kabale University
issn 1471-2342
citation BMC Medical Imaging, vol. 24, no. 1, pp. 1-16, 2024-12-01
affiliation Xiao Lou: Laboratory of Image Science and Technology, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, Southeast University
Juan Zhu: Department of Respiratory Medicine, The People’s Hospital of Zhangqiuqu Area
Jian Yang: Department of Clinical Laboratory, The People’s Hospital of Zhangqiuqu Area
Youzhe Zhu: Laboratory of Image Science and Technology, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, Southeast University
Huazhong Shu: Laboratory of Image Science and Technology, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, Southeast University
Baosheng Li: Laboratory of Image Science and Technology, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, Southeast University
topic Esophageal carcinoma
Simulated CT
Esophageal segmentation
CNN
UNet
Attention