GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation

Bibliographic Details
Main Authors: Shuai Lin, Junhui Yu, Peng Su, Weitao Xue, Yang Qin, Lina Fu, Jing Wen, Hong Huang
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Sensors
Subjects:
Online Access: https://www.mdpi.com/1424-8220/25/1/168
Description
Summary: Six degrees of freedom (6-DoF) object pose estimation is essential for robotic grasping and autonomous driving. While estimating pose from a single RGB image is highly desirable for real-world applications, it presents significant challenges. Many approaches incorporate supplementary information, such as depth data, to derive valuable geometric characteristics. However, deep neural networks still struggle to adequately extract features from object regions in RGB images. To overcome these limitations, we introduce the Geometry-Focused Attention Network (GFA-Net), a novel framework designed for more comprehensive feature extraction through the analysis of critical geometric and textural object characteristics. GFA-Net leverages Point-wise Feature Attention (PFA) to capture subtle pose differences, guiding the network to localize object regions and identify point-wise discrepancies as pose shifts. In addition, a Geometry Feature Aggregation Module (GFAM) integrates multi-scale geometric feature maps to distill crucial geometric features. The resulting dense 2D–3D correspondences are then passed to a Perspective-n-Point (PnP) module for 6-DoF pose computation. Experimental results on the LINEMOD and Occlusion LINEMOD datasets indicate that the proposed method is highly competitive with state-of-the-art approaches, achieving 96.54% and 49.35% accuracy, respectively, under the ADD-S metric with a 0.10d threshold.
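The final step described in the abstract, recovering the 6-DoF pose from dense 2D–3D correspondences via PnP, is a standard operation. Below is a minimal sketch of that step using OpenCV's RANSAC-based PnP solver; the 3D model points, the synthesized 2D projections, and the camera intrinsics (roughly those commonly published for the LINEMOD camera) are illustrative placeholders, not GFA-Net's actual outputs or code.

```python
import numpy as np
import cv2

# Camera intrinsics (illustrative; close to the commonly used LINEMOD camera).
K = np.array([[572.4, 0.0, 325.3],
              [0.0, 573.6, 242.0],
              [0.0, 0.0, 1.0]], dtype=np.float64)
dist = np.zeros(5)  # assume an undistorted image

# Hypothetical 3D model points (object frame, metres) standing in for the dense
# correspondences a network such as GFA-Net would predict per pixel.
pts_3d = np.array([[0.00, 0.00, 0.00],
                   [0.05, 0.00, 0.00],
                   [0.00, 0.05, 0.00],
                   [0.00, 0.00, 0.05],
                   [0.05, 0.05, 0.00],
                   [0.05, 0.00, 0.05]], dtype=np.float64)

# Synthesize their 2D projections under a known pose so the example is self-contained.
rvec_gt = np.array([[0.1], [0.2], [0.3]], dtype=np.float64)  # axis-angle rotation
tvec_gt = np.array([[0.0], [0.0], [0.6]], dtype=np.float64)  # 0.6 m in front of camera
pts_2d, _ = cv2.projectPoints(pts_3d, rvec_gt, tvec_gt, K, dist)
pts_2d = pts_2d.reshape(-1, 2)

# PnP with RANSAC recovers the 6-DoF pose from the 2D-3D correspondences.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts_3d, pts_2d, K, dist,
                                             flags=cv2.SOLVEPNP_EPNP)
R, _ = cv2.Rodrigues(rvec)  # convert axis-angle vector to a 3x3 rotation matrix
print("recovered R:\n", R)
print("recovered t:", tvec.ravel())
```

With noisy per-pixel predictions, the RANSAC wrapper is the relevant design choice: it discards outlier correspondences before estimating the final rotation and translation.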
ISSN: 1424-8220