A Heatmap-Supplemented R-CNN Trained Using an Inflated IoU for Small Object Detection

Object detection architectures struggle to detect small objects across applications including remote sensing and autonomous vehicles. Specifically, for unmanned aerial vehicles, poor detection of small objects directly limits this technology’s applicability. Objects both appear smaller than they are...

Bibliographic Details
Main Authors: Justin Butler, Henry Leung
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Remote Sensing
Subjects: object detection; convolutional neural network; Mask R-CNN; UAV
Online Access: https://www.mdpi.com/2072-4292/16/21/4065
author Justin Butler
Henry Leung
collection DOAJ
description Object detection architectures struggle to detect small objects across applications including remote sensing and autonomous vehicles. Specifically, for unmanned aerial vehicles, poor detection of small objects directly limits this technology’s applicability. Objects both appear smaller than they are in large-scale images captured in aerial imagery and are represented by reduced information in high-altitude imagery. This paper presents a new architecture, CR-CNN, which predicts independent regions of interest from two unique prediction branches within the first stage of the network: a conventional R-CNN convolutional backbone and an hourglass backbone. Utilizing two independent sources within the first stage, our approach leads to an increase in successful predictions of regions that contain smaller objects. Anchor-based methods such as R-CNNs also utilize less than half the number of small objects compared to larger ones during training due to the poor intersection over union (IoU) scores between the generated anchors and the groundtruth—further reducing their performance on small objects. Therefore, we also propose artificially inflating the IoU of smaller objects during training using a simple, size-based Gaussian multiplier—leading to an increase in the quantity of small objects seen per training cycle based on an increase in the number of anchor–object pairs during training. This architecture and training strategy led to improved detection overall on two challenging aerial-based datasets heavily composed of small objects while predicting fewer false positives compared to Mask R-CNN. These results suggest that while new and unique architectures will continue to play a part in advancing the field of object detection, the training methodologies and strategies used will also play a valuable role.
format Article
id doaj-art-1df25575613148819892e1d9c9623a08
institution Kabale University
issn 2072-4292
language English
publishDate 2024-10-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj-art-1df25575613148819892e1d9c9623a08
2024-11-08T14:40:42Z
eng
MDPI AG
Remote Sensing
2072-4292
2024-10-01
vol. 16, no. 21, art. 4065
10.3390/rs16214065
A Heatmap-Supplemented R-CNN Trained Using an Inflated IoU for Small Object Detection
Justin Butler (Department of Electrical and Software Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada)
Henry Leung (Department of Electrical and Software Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada)
https://www.mdpi.com/2072-4292/16/21/4065
object detection
convolutional neural network
Mask R-CNN
UAV
title A Heatmap-Supplemented R-CNN Trained Using an Inflated IoU for Small Object Detection
topic object detection
convolutional neural network
Mask R-CNN
UAV
url https://www.mdpi.com/2072-4292/16/21/4065
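
The description field mentions inflating the IoU of smaller objects with a simple, size-based Gaussian multiplier so that more anchor-object pairs pass the matching threshold during training. This record does not give the exact form of that multiplier, so the following is only a minimal Python sketch under stated assumptions: the function names, the `max_boost` and `sigma` parameters, and the choice of a Gaussian over ground-truth box area are all hypothetical and are not taken from the paper.

```python
import math


def iou(box_a, box_b):
    """Plain IoU for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def size_multiplier(gt_box, max_boost=0.5, sigma=32.0):
    """Assumed Gaussian multiplier: close to 1 + max_boost for tiny
    ground-truth boxes, decaying towards 1 as the box grows.

    With side = sqrt(area), exp(-side**2 / (2 * sigma**2)) simplifies to
    exp(-area / (2 * sigma**2)). Both parameters are illustrative.
    """
    x1, y1, x2, y2 = gt_box
    area = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    return 1.0 + max_boost * math.exp(-area / (2.0 * sigma ** 2))


def inflated_iou(anchor, gt_box, max_boost=0.5, sigma=32.0):
    """IoU scaled by the size-based multiplier and capped at 1.0."""
    scaled = iou(anchor, gt_box) * size_multiplier(gt_box, max_boost, sigma)
    return min(1.0, scaled)


if __name__ == "__main__":
    anchor = (0.0, 0.0, 16.0, 16.0)
    small_gt = (4.0, 4.0, 12.0, 12.0)  # 8 x 8 pixel object
    print("raw IoU:     ", iou(anchor, small_gt))
    print("inflated IoU:", inflated_iou(anchor, small_gt))
```

In this sketch an 8 x 8 ground-truth box inside a 16 x 16 anchor has a raw IoU of 0.25, which the multiplier raises to about 0.37. Under a hypothetical matching threshold of 0.3, that pair would be used for training only after inflation, while large boxes are left essentially unchanged.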