MGLI-Former: a multi-scale and global-local information interactive attention transformer for urban shantytown extraction

Bibliographic Details
Main Authors: Shouhang Du, Shaoyu Wang, Yuhao Hua, Shu Peng, Fei Qin, Xue Li, Yufei Wu
Format: Article
Language: English
Published: Taylor & Francis Group 2024-12-01
Series: International Journal of Digital Earth
Online Access: https://www.tandfonline.com/doi/10.1080/17538947.2024.2432522
Description
Summary: Shantytowns, characterized by poor living conditions and simple houses, require efficient extraction and analysis to support urban planning. This paper proposes a multi-scale and global-local information interactive attention transformer (MGLI-Former) for shantytown extraction from high-resolution remote sensing images. First, the multi-level feature fusion block (MLFFB) integrates neighborhood encoding features to prevent the loss of small-target shantytowns. Second, joint global and local information transformer blocks (JGLB) combine global and local features. Finally, the boundary and feature joint optimization loss (BF-Loss) refines the output using edges and high-level semantics. Experiments in Beijing and Shanghai demonstrate that MGLI-Former achieved the best visual and quantitative extraction results. The F1-score, IoU, Precision, and Recall are 86.92%, 76.87%, 86.84%, and 87.01% for Beijing and 72.33%, 56.66%, 69.29%, and 75.65% for Shanghai. Experiments on the UIS-Shenzhen dataset and fine-tuning experiments with mixed datasets further validate the robustness and generalization capabilities of MGLI-Former. In addition, the spatial and landscape patterns of shantytowns in Beijing and Shanghai reveal that: (1) Beijing's shantytowns radiate uniformly from the old city center, whereas Shanghai exhibits a multi-core diffusion pattern; (2) Shanghai's shantytown distribution is clustered, while Beijing's is more uniform. MGLI-Former demonstrates potential for extracting shantytowns and has significant implications for urban planning and management.
ISSN: 1753-8947
1753-8955
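
The summary reports F1-score, IoU, Precision, and Recall for the extraction results. As a minimal sketch of how such pixel-wise scores are conventionally computed for a binary mask (shantytown vs. background), the function below counts true positives, false positives, and false negatives. The function name `binary_segmentation_metrics` and the toy masks are hypothetical illustrations, not taken from the paper, and the authors' evaluation code may differ.

```python
def binary_segmentation_metrics(pred, truth):
    """Pixel-wise precision, recall, F1, and IoU for two binary masks.

    `pred` and `truth` are equally shaped nested lists of 0/1 values,
    where 1 marks the target class (here: shantytown pixels).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # predicted target, truly target
            elif p and not t:
                fp += 1          # predicted target, truly background
            elif t and not p:
                fn += 1          # missed target pixel

    # Guard against empty denominators (e.g. no predicted target pixels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy 2x3 masks, purely illustrative.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
metrics = binary_segmentation_metrics(pred, truth)
print(metrics)
```

Under these standard definitions, IoU and F1 are linked by IoU = F1 / (2 - F1). The first set of reported figures fits this relation: 0.8692 / (2 - 0.8692) ≈ 0.7687, matching the stated IoU of 76.87%.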