Region Boosting for Real-Time Object Detection Using Multi-Dimensional Attention
Real-time object detection remains an important topic in computer vision. Balancing the accuracy and speed of object detectors is a formidable challenge for both academic researchers and industry practitioners. In this paper, considering the latest models may be somewhat over-optimized for anchor-fr...
| Main Authors: | Jinlong Chen, Kejian Xu, Yi Ning, Zhi Xu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
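| Subjects: | Real-time object detection; attention mechanism; region boosting; label assignment strategy; loss function |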
| Online Access: | https://ieeexplore.ieee.org/document/10745475/ |
| _version_ | 1846159849364127744 |
|---|---|
| author | Jinlong Chen, Kejian Xu, Yi Ning, Zhi Xu |
| author_sort | Jinlong Chen |
| collection | DOAJ |
| description | Real-time object detection remains an important topic in computer vision. Balancing the accuracy and speed of object detectors is a formidable challenge for both academic researchers and industry practitioners. In this paper, considering that the latest models may be somewhat over-optimized for anchor-free pipelines, we elect to use YOLOX as our baseline and introduce a series of enhancements, forming a new high-performance detector named YOLOAX. To further exploit the power of the attention mechanism, we devise multi-dimensional attention-based modules that can activate CNNs, emphasizing regions of interest and boosting the capacity to learn informative representations from feature maps. Moreover, we introduce a new label assignment strategy called STA, along with a novel loss function named GEIOU Loss, to further refine our object detector’s performance. Extensive ablation studies on the COCO and PASCAL VOC 2012 datasets are provided to validate our proposed methods. Our YOLOAX series is trained solely on the COCO dataset from scratch, without any prior knowledge, surpassing the YOLOX series by a margin of 4.0% AP. In particular, YOLOAX-X achieves 55.2% AP on the COCO 2017 test set while maintaining a real-time speed of 82.4 fps. |
| format | Article |
| id | doaj-art-d31265deb2d54474b4a069ca385ee45b |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-d31265deb2d54474b4a069ca385ee45b; 2024-11-23T00:01:02Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 171634-171643; DOI 10.1109/ACCESS.2024.3492493; document 10745475; Region Boosting for Real-Time Object Detection Using Multi-Dimensional Attention; Jinlong Chen (School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China); Kejian Xu (https://orcid.org/0009-0003-2647-2196; School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China); Yi Ning (School of Continuing Education, Guilin University of Electronic Technology, Guilin, China); Zhi Xu (https://orcid.org/0000-0001-8665-1020; School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China); https://ieeexplore.ieee.org/document/10745475/; Real-time object detection; attention mechanism; region boosting; label assignment strategy; loss function |
| title | Region Boosting for Real-Time Object Detection Using Multi-Dimensional Attention |
| topic | Real-time object detection; attention mechanism; region boosting; label assignment strategy; loss function |
| url | https://ieeexplore.ieee.org/document/10745475/ |