Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation
Abstract. Purpose: The segmentation of target volumes and organs at risk (OAR) is a significant part of radiotherapy. Specifically, determining the location and scale of the esophagus in simulated computed tomography images is difficult and time-consuming, primarily due to its complex structure and lo...
| Main Authors: | Xiao Lou, Juan Zhu, Jian Yang, Youzhe Zhu, Huazhong Shu, Baosheng Li |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC 2024-12-01 |
| Series: | BMC Medical Imaging |
| Subjects: | Esophageal carcinoma, Simulated CT, Esophageal segmentation, CNN, UNet, Attention |
| Online Access: | https://doi.org/10.1186/s12880-024-01515-x |
| _version_ | 1846111972478681088 |
|---|---|
| author | Xiao Lou Juan Zhu Jian Yang Youzhe Zhu Huazhong Shu Baosheng Li |
| author_facet | Xiao Lou Juan Zhu Jian Yang Youzhe Zhu Huazhong Shu Baosheng Li |
| author_sort | Xiao Lou |
| collection | DOAJ |
| description | Abstract. Purpose: The segmentation of target volumes and organs at risk (OAR) is a significant part of radiotherapy. Specifically, determining the location and scale of the esophagus in simulated computed tomography images is difficult and time-consuming, primarily due to its complex structure and low contrast with the surrounding tissues. In this study, an Enhanced Cross-stage-attention U-Net was proposed to solve the segmentation problem for the esophageal gross tumor volume (GTV) and clinical target volume (CTV) in CT images. Methods: First, a module based on principal component analysis theory was constructed to pre-extract the features of the input image. Then, a cross-stage feature fusion model, composed of a Wide Range Attention (WRA) unit, a Small-kernel Local Attention (SLA) unit, and an Inverted Bottleneck (IBN) unit, was designed to replace the skip concatenation of the original UNet. WRA was employed to capture global attention, and its large convolution kernel was further decomposed to simplify the calculation. SLA was used to complement WRA with local attention. IBN was constructed to fuse the extracted features, and included a global frequency response layer built to redistribute the frequency response of the fused feature maps. Results: The proposed method was compared with relevant published esophageal segmentation methods. The prediction of the proposed network was MSD = 2.83 (1.62, 4.76) mm, HD = 11.79 ± 6.02 mm, DC = 72.45 ± 19.18% in GTV; MSD = 5.26 (2.18, 8.82) mm, HD = 16.22 ± 10.01 mm, DC = 71.06 ± 17.72% in CTV. Conclusion: The reconstruction of the skip concatenation in UNet improved performance for esophageal segmentation. The results showed that the proposed network performed better on esophageal GTV and CTV segmentation. |
| format | Article |
| id | doaj-art-57267d0d34b140ceb8d967ef44272e48 |
| institution | Kabale University |
| issn | 1471-2342 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Medical Imaging |
| spelling | doaj-art-57267d0d34b140ceb8d967ef44272e48; 2024-12-22T12:55:40Z; eng; BMC; BMC Medical Imaging; 1471-2342; 2024-12-01; 241116; 10.1186/s12880-024-01515-x; Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation; Xiao Lou (0), Juan Zhu (1), Jian Yang (2), Youzhe Zhu (3), Huazhong Shu (4), Baosheng Li (5); (0) Laboratory of Image Science and Technology, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, Southeast University; (1) Department of Respiratory Medicine, The People’s Hospital of Zhangqiuqu Area; (2) Department of Clinical Laboratory, The People’s Hospital of Zhangqiuqu Area; (3) Laboratory of Image Science and Technology, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, Southeast University; (4) Laboratory of Image Science and Technology, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, Southeast University; (5) Laboratory of Image Science and Technology, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, Southeast University; Abstract. Purpose: The segmentation of target volumes and organs at risk (OAR) is a significant part of radiotherapy. Specifically, determining the location and scale of the esophagus in simulated computed tomography images is difficult and time-consuming, primarily due to its complex structure and low contrast with the surrounding tissues. In this study, an Enhanced Cross-stage-attention U-Net was proposed to solve the segmentation problem for the esophageal gross tumor volume (GTV) and clinical target volume (CTV) in CT images. Methods: First, a module based on principal component analysis theory was constructed to pre-extract the features of the input image. Then, a cross-stage feature fusion model, composed of a Wide Range Attention (WRA) unit, a Small-kernel Local Attention (SLA) unit, and an Inverted Bottleneck (IBN) unit, was designed to replace the skip concatenation of the original UNet. WRA was employed to capture global attention, and its large convolution kernel was further decomposed to simplify the calculation. SLA was used to complement WRA with local attention. IBN was constructed to fuse the extracted features, and included a global frequency response layer built to redistribute the frequency response of the fused feature maps. Results: The proposed method was compared with relevant published esophageal segmentation methods. The prediction of the proposed network was MSD = 2.83 (1.62, 4.76) mm, HD = 11.79 ± 6.02 mm, DC = 72.45 ± 19.18% in GTV; MSD = 5.26 (2.18, 8.82) mm, HD = 16.22 ± 10.01 mm, DC = 71.06 ± 17.72% in CTV. Conclusion: The reconstruction of the skip concatenation in UNet improved performance for esophageal segmentation. The results showed that the proposed network performed better on esophageal GTV and CTV segmentation.; https://doi.org/10.1186/s12880-024-01515-x; Esophageal carcinoma; Simulated CT; Esophageal segmentation; CNN; UNet; Attention |
| spellingShingle | Xiao Lou Juan Zhu Jian Yang Youzhe Zhu Huazhong Shu Baosheng Li Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation BMC Medical Imaging Esophageal carcinoma Simulated CT Esophageal segmentation CNN UNet Attention |
| title | Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation |
| title_full | Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation |
| title_fullStr | Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation |
| title_full_unstemmed | Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation |
| title_short | Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation |
| title_sort | enhanced cross stage attention u net for esophageal target volume segmentation |
| topic | Esophageal carcinoma Simulated CT Esophageal segmentation CNN UNet Attention |
| url | https://doi.org/10.1186/s12880-024-01515-x |
| work_keys_str_mv | AT xiaolou enhancedcrossstageattentionunetforesophagealtargetvolumesegmentation AT juanzhu enhancedcrossstageattentionunetforesophagealtargetvolumesegmentation AT jianyang enhancedcrossstageattentionunetforesophagealtargetvolumesegmentation AT youzhezhu enhancedcrossstageattentionunetforesophagealtargetvolumesegmentation AT huazhongshu enhancedcrossstageattentionunetforesophagealtargetvolumesegmentation AT baoshengli enhancedcrossstageattentionunetforesophagealtargetvolumesegmentation |
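
The description above reports results as MSD, HD, and DC without defining them. Assuming these abbreviations denote mean surface distance, Hausdorff distance, and Dice coefficient (the usual reading in segmentation work, but not stated in this record), the following minimal Python sketch shows how such metrics can be computed for two small point sets. The function names and the toy coordinates are hypothetical and are not taken from the article. The sketch also pools both directions when averaging surface distances, which is only one of several conventions in use.

```python
from math import dist


def dice_coefficient(pred, truth):
    """Dice coefficient between two sets of voxel coordinates."""
    if not pred and not truth:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def directed_distances(a, b):
    """For each point in a, the Euclidean distance to its nearest point in b."""
    return [min(dist(p, q) for q in b) for p in a]


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: the largest nearest-point distance."""
    return max(max(directed_distances(a, b)), max(directed_distances(b, a)))


def mean_surface_distance(a, b):
    """Mean of all nearest-point distances, pooled over both directions."""
    d = directed_distances(a, b) + directed_distances(b, a)
    return sum(d) / len(d)


# Toy 2D example (hypothetical data): two 2x2 squares overlapping by one column.
# For simplicity the same point sets serve as both the mask and its surface;
# a real evaluation would extract boundary voxels and scale them to millimetres.
pred = {(0, 0), (0, 1), (1, 0), (1, 1)}
truth = {(0, 1), (1, 1), (0, 2), (1, 2)}

print(dice_coefficient(pred, truth))       # 0.5
print(hausdorff_distance(pred, truth))     # 1.0
print(mean_surface_distance(pred, truth))  # 0.5
```

The brute-force nearest-point search is quadratic in the number of points, which is acceptable for a toy example but would need a spatial index or a distance transform for full CT volumes.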