6D Object Pose Estimation With Compact Generalized Non-Local Operation
Real-time object detection and pose estimation are critical in practical applications such as virtual reality, scene understanding, and robotics. In this paper, we propose a compact generalized non-local pose estimation network capable of directly predicting the projection of an object’s...
| Main Authors: | Changhong Jiang, Xiaoqiao Mu, Bingbing Zhang, Chao Liang, Mujun Xie |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
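| Subjects: | Correlations; subtle feature; end-to-end; long-range spatiotemporal; fine-grained details; representational power |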
| Online Access: | https://ieeexplore.ieee.org/document/10771728/ |
| _version_ | 1846129603130687488 |
|---|---|
| author | Changhong Jiang Xiaoqiao Mu Bingbing Zhang Chao Liang Mujun Xie |
| author_facet | Changhong Jiang Xiaoqiao Mu Bingbing Zhang Chao Liang Mujun Xie |
| author_sort | Changhong Jiang |
| collection | DOAJ |
| description | Real-time object detection and pose estimation are critical in practical applications such as virtual reality, scene understanding, and robotics. In this paper, we propose a compact generalized non-local pose estimation network capable of directly predicting the projection of an object’s 3D bounding box vertices onto a 2D image, facilitating the estimation of the object’s 6D pose. The network is constructed using the YOLOv5 model, with the integration of an improved non-local module termed the Compact Generalized Non-local Block. This module enhances feature representation by learning the correlations between the positions of all elements across channels, effectively capturing subtle feature cues. The proposed network is end-to-end trainable, producing accurate pose predictions without the need for any post-processing operations. Extensive validation on the LineMod dataset shows that our approach achieves a final accuracy of 46.1% on the average 3D distance of model vertices (ADD) metric, outperforming existing methods by 6.9% and our baseline model by 1.8%, thus underscoring the efficacy of the proposed network. |
| format | Article |
| id | doaj-art-4abda70ec3204d2987fb4961b9427e16 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-4abda70ec3204d2987fb4961b9427e16 2024-12-10T00:02:18Z eng IEEE IEEE Access 2169-3536 2024-01-01 12 178080 178088 10.1109/ACCESS.2024.3508772 10771728 6D Object Pose Estimation With Compact Generalized Non-Local Operation Changhong Jiang 0 https://orcid.org/0000-0001-9646-6179; Xiaoqiao Mu 1 https://orcid.org/0009-0009-3127-1157; Bingbing Zhang 2 https://orcid.org/0000-0002-4734-4164; Chao Liang 3 https://orcid.org/0009-0001-6084-6900; Mujun Xie 4 https://orcid.org/0000-0002-4984-6504; School of Electrical and Electronic Engineering, Changchun University of Technology, Changchun, China; School of Mechanical and Electrical Engineering, Changchun University of Technology, Changchun, China; School of Computer Science and Engineering, Dalian Minzu University, Dalian, China; College of Computer Science and Engineering, Changchun University of Technology, Changchun, China; School of Electrical and Electronic Engineering, Changchun University of Technology, Changchun, China; Real-time object detection and pose estimation are critical in practical applications such as virtual reality, scene understanding, and robotics. In this paper, we propose a compact generalized non-local pose estimation network capable of directly predicting the projection of an object’s 3D bounding box vertices onto a 2D image, facilitating the estimation of the object’s 6D pose. The network is constructed using the YOLOv5 model, with the integration of an improved non-local module termed the Compact Generalized Non-local Block. This module enhances feature representation by learning the correlations between the positions of all elements across channels, effectively capturing subtle feature cues. The proposed network is end-to-end trainable, producing accurate pose predictions without the need for any post-processing operations. Extensive validation on the LineMod dataset shows that our approach achieves a final accuracy of 46.1% on the average 3D distance of model vertices (ADD) metric, outperforming existing methods by 6.9% and our baseline model by 1.8%, thus underscoring the efficacy of the proposed network. https://ieeexplore.ieee.org/document/10771728/ Correlations; subtle feature; end-to-end; long-range spatiotemporal; fine-grained details; representational power |
| spellingShingle | Changhong Jiang Xiaoqiao Mu Bingbing Zhang Chao Liang Mujun Xie 6D Object Pose Estimation With Compact Generalized Non-Local Operation IEEE Access Correlations subtle feature end-to-end long-range spatiotemporal fine-grained details representational power |
| title | 6D Object Pose Estimation With Compact Generalized Non-Local Operation |
| title_full | 6D Object Pose Estimation With Compact Generalized Non-Local Operation |
| title_fullStr | 6D Object Pose Estimation With Compact Generalized Non-Local Operation |
| title_full_unstemmed | 6D Object Pose Estimation With Compact Generalized Non-Local Operation |
| title_short | 6D Object Pose Estimation With Compact Generalized Non-Local Operation |
| title_sort | 6d object pose estimation with compact generalized non local operation |
| topic | Correlations; subtle feature; end-to-end; long-range spatiotemporal; fine-grained details; representational power |
| url | https://ieeexplore.ieee.org/document/10771728/ |
| work_keys_str_mv | AT changhongjiang 6dobjectposeestimationwithcompactgeneralizednonlocaloperation AT xiaoqiaomu 6dobjectposeestimationwithcompactgeneralizednonlocaloperation AT bingbingzhang 6dobjectposeestimationwithcompactgeneralizednonlocaloperation AT chaoliang 6dobjectposeestimationwithcompactgeneralizednonlocaloperation AT mujunxie 6dobjectposeestimationwithcompactgeneralizednonlocaloperation |
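
The description in this record reports accuracy on the average 3D distance of model vertices (ADD) metric. As a rough illustration of how that metric is commonly computed, the sketch below averages the distance between model points transformed by the ground-truth pose and by the predicted pose. It is not taken from the paper: the function names (`transform`, `add_metric`, `model_diameter`, `is_pose_correct`) are hypothetical, and the 10% of model diameter acceptance threshold is an assumed convention that this record does not state.

```python
import math


def transform(R, t, p):
    """Apply a 3x3 rotation R (nested lists) and translation t to a 3D point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


def add_metric(points, R_gt, t_gt, R_pred, t_pred):
    """Mean Euclidean distance between points under the two poses."""
    total = 0.0
    for p in points:
        total += math.dist(transform(R_gt, t_gt, p), transform(R_pred, t_pred, p))
    return total / len(points)


def model_diameter(points):
    """Largest pairwise distance between model points (brute force)."""
    return max(math.dist(a, b) for a in points for b in points)


def is_pose_correct(points, R_gt, t_gt, R_pred, t_pred, threshold=0.1):
    """Assumed convention: accept the pose if ADD < threshold * diameter."""
    add = add_metric(points, R_gt, t_gt, R_pred, t_pred)
    return add < threshold * model_diameter(points)


if __name__ == "__main__":
    # Toy model: the eight vertices of a unit cube (illustrative data only).
    cube = [[x, y, z] for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # The predicted pose is shifted 0.05 units along x from the ground truth.
    print(add_metric(cube, identity, [0.0, 0.0, 0.0], identity, [0.05, 0.0, 0.0]))
    print(is_pose_correct(cube, identity, [0.0, 0.0, 0.0], identity, [0.05, 0.0, 0.0]))
```

In this toy case the ADD value is approximately 0.05, which falls below 10% of the cube's diagonal, so the pose would count as correct under the assumed threshold. An accuracy figure such as the one quoted in the description would then be the fraction of test images whose predicted pose passes this check.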