6D Object Pose Estimation With Compact Generalized Non-Local Operation

Real-time object detection and pose estimation are critical in practical applications such as virtual reality, scene understanding, and robotics. In this paper, we propose a compact generalized non-local pose estimation network capable of directly predicting the projection of an object’s 3D bounding box vertices onto a 2D image, facilitating the estimation of the object’s 6D pose. The network is built on the YOLOv5 model, with the integration of an improved non-local module termed the Compact Generalized Non-local Block. This module enhances feature representation by learning the correlations between the positions of all elements across channels, effectively capturing subtle feature cues. The proposed network is end-to-end trainable, producing accurate pose predictions without the need for any post-processing operations. Extensive validation on the LineMod dataset shows that our approach achieves a final accuracy of 46.1% on the average 3D distance of model vertices (ADD) metric, outperforming existing methods by 6.9% and our baseline model by 1.8%, underscoring the efficacy of the proposed network.
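The abstract describes the key module, the Compact Generalized Non-local Block, as learning correlations between the positions of all elements across channels rather than only across spatial locations. As a rough illustration of that idea, here is a minimal, linear-kernel sketch in PyTorch; the class name, the channel-reduction ratio, and the residual/BatchNorm arrangement are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class CompactGeneralizedNonLocal(nn.Module):
    """Simplified sketch of a compact generalized non-local block.

    The feature map is flattened so that correlations span all channels
    *and* all spatial positions, and the linearized ordering
    theta * (phi^T g) keeps the cost linear in C*H*W instead of
    quadratic (the "compact" part).
    """

    def __init__(self, channels: int, reduction: int = 2):
        super().__init__()
        inner = channels // reduction
        self.theta = nn.Conv2d(channels, inner, kernel_size=1, bias=False)
        self.phi = nn.Conv2d(channels, inner, kernel_size=1, bias=False)
        self.g = nn.Conv2d(channels, inner, kernel_size=1, bias=False)
        self.out = nn.Conv2d(inner, channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        # Flatten channel and spatial dimensions together so the affinity
        # relates every element of every channel to every other element.
        t = self.theta(x).reshape(b, 1, -1)   # (B, 1, C'*H*W)
        p = self.phi(x).reshape(b, 1, -1)
        g = self.g(x).reshape(b, 1, -1)
        # Linear-kernel "compact" trick: compute phi^T g first (a scalar
        # per sample) instead of the full (C'HW x C'HW) affinity matrix.
        att = torch.bmm(p, g.transpose(1, 2)) / g.shape[-1]   # (B, 1, 1)
        y = torch.bmm(att, t).reshape(b, -1, h, w)            # (B, C', H, W)
        # Residual connection back onto the input feature map.
        return x + self.bn(self.out(y))
```

A block like this would typically be inserted after a convolutional stage of the YOLOv5 backbone or neck; exactly where the authors place it is not stated in this record.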

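The reported 46.1% is on the ADD metric: the average 3D distance between model vertices transformed by the predicted pose and by the ground-truth pose, with a pose conventionally counted correct on LineMod when that distance is below 10% of the object diameter. A minimal NumPy sketch of the metric follows; the function names and the 10% threshold convention are standard practice, not taken from the paper.

```python
import numpy as np


def add_metric(R_pred, t_pred, R_gt, t_gt, model_points):
    """Average 3D distance of model vertices (ADD).

    model_points: (N, 3) vertices of the object model,
    R_*: (3, 3) rotation matrices, t_*: (3,) translation vectors.
    Returns the mean vertex-to-vertex distance between the two poses.
    """
    pred = model_points @ R_pred.T + t_pred
    gt = model_points @ R_gt.T + t_gt
    return np.linalg.norm(pred - gt, axis=1).mean()


def add_accuracy(distances, diameter, threshold=0.1):
    """Fraction of test poses whose ADD falls below threshold * diameter
    (the usual LineMod correctness criterion)."""
    return float(np.mean(np.asarray(distances) < threshold * diameter))
```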
Bibliographic Details
Main Authors:
Changhong Jiang (ORCID: 0000-0001-9646-6179), School of Electrical and Electronic Engineering, Changchun University of Technology, Changchun, China
Xiaoqiao Mu (ORCID: 0009-0009-3127-1157), School of Mechanical and Electrical Engineering, Changchun University of Technology, Changchun, China
Bingbing Zhang (ORCID: 0000-0002-4734-4164), School of Computer Science and Engineering, Dalian Minzu University, Dalian, China
Chao Liang (ORCID: 0009-0001-6084-6900), College of Computer Science and Engineering, Changchun University of Technology, Changchun, China
Mujun Xie (ORCID: 0000-0002-4984-6504), School of Electrical and Electronic Engineering, Changchun University of Technology, Changchun, China
Format: Article
Language: English
Published: IEEE, 2024-01-01
Series: IEEE Access, Vol. 12 (2024), pp. 178080-178088
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3508772
Subjects: Correlations; subtle feature; end-to-end; long-range spatiotemporal; fine-grained details; representational power
Online Access: https://ieeexplore.ieee.org/document/10771728/