A Novel Transformer-Based Object Detection Method With Geometric and Object Co-Occurrence Prior Knowledge for Remote Sensing Images
Main Authors:
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10804220/
Summary: Artificial target detection is a key technology for Earth observation applications in remote sensing images, including environmental monitoring, urban planning, intelligence reconnaissance, and land mapping. Recently, transformer-based object detectors have achieved great success in computer vision through the attention mechanism. However, these detectors lack prior information, which may limit their ability to detect multiscale, multishape, and densely distributed targets. To address these problems, we propose a novel transformer-based object detection method with geometric and object co-occurrence prior knowledge for remote sensing images, built on the deformable detection transformer (DETR). First, we introduce multi-pattern dynamic anchor object queries to detect multiscale and densely distributed objects. Second, we propose a novel geometrically invariant distance to measure the position deviation of multishape objects. Last, we design a graph convolutional reference module with co-occurrence prior knowledge to improve the inferential ability of the detector. Experimental results confirm that the proposed method outperforms most state-of-the-art methods, with mean average precision (mAP) of 70.2% on the DIOR dataset, 91.0% on the HRRSD dataset, and 91.4% on the NWPU VHR-10 dataset. On all three public datasets, the mAP is more than 5% higher than that of deformable DETR.
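The abstract gives no implementation details, but co-occurrence reasoning of the kind it describes is commonly realized as a graph convolution over class statistics. Below is a minimal, hypothetical PyTorch sketch of one such step: a class co-occurrence matrix (e.g., estimated from training-set annotations) is row-normalized into a graph adjacency, and a single graph-convolution pass propagates it over the per-query class logits of a DETR-style head. The module name `CoOccurrenceGCN`, the shapes, and the residual form are all illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn


class CoOccurrenceGCN(nn.Module):
    """Hypothetical sketch: refine per-query class logits with a
    class co-occurrence prior via one graph-convolution step.

    `cooccurrence` is a (C, C) matrix of class co-occurrence counts,
    assumed to be estimated from training-set annotations.
    """

    def __init__(self, num_classes: int, cooccurrence: torch.Tensor):
        super().__init__()
        # Row-normalize counts into a stochastic adjacency matrix,
        # adding self-loops so each class keeps its own evidence.
        adj = cooccurrence + torch.eye(num_classes)
        adj = adj / adj.sum(dim=1, keepdim=True).clamp_min(1e-6)
        self.register_buffer("adj", adj)
        self.weight = nn.Linear(num_classes, num_classes, bias=False)

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        # logits: (batch, num_queries, num_classes) from the detection head.
        probs = logits.softmax(dim=-1)
        # Propagate class evidence along co-occurrence edges: classes
        # that frequently co-occur in scenes reinforce each other.
        propagated = probs @ self.adj.T
        # Residual refinement keeps the original predictions dominant.
        return logits + self.weight(propagated)
```

In a deformable-DETR-style pipeline, such a module would sit after the classification head of a decoder layer; the residual connection is one plausible way to let the prior adjust, rather than override, strong visual evidence.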
ISSN: 1939-1404, 2151-1535