Language-Guided Semantic Clustering for Remote Sensing Change Detection
| Main Authors: | , , , , |
|---|---|
| Format: | Article | 
| Language: | English | 
| Published: | MDPI AG, 2024-12-01 |
| Series: | Sensors | 
| Subjects: | |
| Online Access: | https://www.mdpi.com/1424-8220/24/24/7887 | 
| Summary: | Existing learning-based remote sensing change detection (RSCD) methods commonly use semantic-agnostic binary masks as supervision, which hinders their ability to distinguish between different semantic types of change and results in noisy change mask predictions. To address this issue, this paper presents a language-guided semantic clustering framework, dubbed LSC-CD, that transfers the rich semantic information of the contrastive language-image pretraining (CLIP) model to RSCD. LSC-CD exploits the strong zero-shot generalization of CLIP, which makes it easy to transfer semantic knowledge from CLIP into the CD model under semantic-agnostic binary mask supervision. Specifically, LSC-CD first constructs a category text-prior memory bank based on dataset statistics and then uses CLIP to transform the text in the memory bank into the corresponding semantic embeddings. Afterward, a CLIP adapter module (CAM) fine-tunes the semantic embeddings to align them with the change region embeddings from the input bi-temporal images. Next, a semantic clustering module (SCM) clusters the change region embeddings around the semantic embeddings, yielding compact change embeddings that are robust to noisy backgrounds. Finally, a lightweight decoder decodes the compact change embeddings into an accurate change mask prediction. Experimental results on three public benchmarks (LEVIR-CD, WHU-CD, and SYSU-CD) demonstrate that the proposed LSC-CD achieves state-of-the-art performance on all evaluated metrics. |
| ISSN: | 1424-8220 | 
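
The clustering step described in the summary (pulling change region embeddings toward text-derived semantic embeddings) can be illustrated with a minimal sketch. This is not the paper's SCM: it assumes a plain cosine-similarity soft assignment, the function name `cluster_to_semantics` and the `temperature` parameter are hypothetical, and the toy 2-D vectors stand in for embeddings that would in practice come from a CLIP text encoder and an image backbone.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def softmax(scores, temperature=0.1):
    """Numerically stable softmax; a lower temperature gives sharper weights."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cluster_to_semantics(region_embeddings, semantic_embeddings, temperature=0.1):
    """Hypothetical stand-in for a semantic clustering step.

    Each region embedding is softly assigned to the semantic embeddings and
    replaced by the assignment-weighted mix of those semantic embeddings.
    """
    compact, assignments = [], []
    for region in region_embeddings:
        sims = [cosine(region, sem) for sem in semantic_embeddings]
        weights = softmax(sims, temperature)
        mixed = [
            sum(w * sem[d] for w, sem in zip(weights, semantic_embeddings))
            for d in range(len(region))
        ]
        compact.append(mixed)
        assignments.append(weights)
    return compact, assignments


# Toy example: two assumed category prototypes and two region embeddings.
semantic = [[1.0, 0.0], [0.0, 1.0]]
regions = [[0.9, 0.1], [0.2, 0.8]]
compact, assignments = cluster_to_semantics(regions, semantic)
```

In this toy setup each region ends up dominated by its nearest prototype, which is the sense in which the resulting embeddings are "compact"; how the published model learns and applies its clustering is not specified in this record.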
 
       