GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation

Six degrees of freedom (6-DoF) object pose estimation is essential for robotic grasping and autonomous driving. While estimating pose from a single RGB image is highly desirable for real-world applications, it presents significant challenges. Many approaches incorporate supplementary information, such as depth data, to derive valuable geometric characteristics. However, the challenge of deep neural networks inadequately extracting features from object regions in RGB images remains. To overcome these limitations, we introduce the Geometry-Focused Attention Network (GFA-Net), a novel framework designed for more comprehensive feature extraction by analyzing critical geometric and textural object characteristics. GFA-Net leverages Point-wise Feature Attention (PFA) to capture subtle pose differences, guiding the network to localize object regions and identify point-wise discrepancies as pose shifts. In addition, a Geometry Feature Aggregation Module (GFAM) integrates multi-scale geometric feature maps to distill crucial geometric features. Then, the resulting dense 2D–3D correspondences are passed to a Perspective-n-Point (PnP) module for 6-DoF pose computation. Experimental results on the LINEMOD and Occlusion LINEMOD datasets indicate that our proposed method is highly competitive with state-of-the-art approaches, achieving 96.54% and 49.35% accuracy, respectively, utilizing the ADD-S metric with a 0.10d threshold.


Bibliographic Details
Main Authors: Shuai Lin, Junhui Yu, Peng Su, Weitao Xue, Yang Qin, Lina Fu, Jing Wen, Hong Huang
Format: Article
Language: English
Published: MDPI AG, 2024-12-01
Series: Sensors
Subjects: pose estimation; RGB image; deep learning; geometric feature; dense correspondences
Online Access: https://www.mdpi.com/1424-8220/25/1/168
collection DOAJ
description Six degrees of freedom (6-DoF) object pose estimation is essential for robotic grasping and autonomous driving. While estimating pose from a single RGB image is highly desirable for real-world applications, it presents significant challenges. Many approaches incorporate supplementary information, such as depth data, to derive valuable geometric characteristics. However, the challenge of deep neural networks inadequately extracting features from object regions in RGB images remains. To overcome these limitations, we introduce the Geometry-Focused Attention Network (GFA-Net), a novel framework designed for more comprehensive feature extraction by analyzing critical geometric and textural object characteristics. GFA-Net leverages Point-wise Feature Attention (PFA) to capture subtle pose differences, guiding the network to localize object regions and identify point-wise discrepancies as pose shifts. In addition, a Geometry Feature Aggregation Module (GFAM) integrates multi-scale geometric feature maps to distill crucial geometric features. Then, the resulting dense 2D–3D correspondences are passed to a Perspective-n-Point (PnP) module for 6-DoF pose computation. Experimental results on the LINEMOD and Occlusion LINEMOD datasets indicate that our proposed method is highly competitive with state-of-the-art approaches, achieving 96.54% and 49.35% accuracy, respectively, utilizing the ADD-S metric with a 0.10d threshold.
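The ADD-S evaluation metric cited in the abstract above is easy to state concretely. The sketch below is an illustrative NumPy reimplementation, not code from the paper (the function names and signatures are my own): for each model point under the ground-truth pose, take the distance to the closest model point under the predicted pose, then average; the pose counts as correct when that score falls below 10% of the model diameter (the 0.10d threshold).

```python
import numpy as np

def add_s(R_pred, t_pred, R_gt, t_gt, model_pts):
    """ADD-S: for each ground-truth-posed model point, the distance to the
    *closest* predicted-posed point, averaged over the model. The
    closest-point matching makes the metric tolerant of object symmetries."""
    pred = model_pts @ R_pred.T + t_pred           # (N, 3) under predicted pose
    gt = model_pts @ R_gt.T + t_gt                 # (N, 3) under ground-truth pose
    dists = np.linalg.norm(gt[:, None, :] - pred[None, :, :], axis=-1)  # (N, N)
    return dists.min(axis=1).mean()

def pose_correct(score, diameter, threshold=0.10):
    # The 0.10d criterion: accept if ADD-S is below 10% of the model diameter.
    return score < threshold * diameter
```

For a perfect prediction the score is exactly zero; a pure translation error of 1 cm yields a score of at most 1 cm, since each point's nearest neighbour is no farther than its own shifted copy.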
id doaj-art-054dcf1b38cc4709a0f33ad20ac6fbb0
institution Kabale University
issn 1424-8220
citation Sensors, Vol. 25, No. 1, Article 168, MDPI AG, 2024-12-01; ISSN 1424-8220; DOI 10.3390/s25010168
author affiliations:
Shuai Lin: Shandong Non-Metallic Materials Institute, Jinan 250031, China
Junhui Yu, Peng Su, Lina Fu, Jing Wen, Hong Huang: Key Laboratory of Optoelectronic Technology and Systems of the Education Ministry of China, Chongqing University, Chongqing 400044, China
Weitao Xue, Yang Qin: Beijing Institute of Space Machinery and Electronics, Product Testing Center, Beijing 100094, China
keywords pose estimation; RGB image; deep learning; geometric feature; dense correspondences
topic pose estimation
RGB image
deep learning
geometric feature
dense correspondences