GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation
Six degrees of freedom (6-DoF) object pose estimation is essential for robotic grasping and autonomous driving. While estimating pose from a single RGB image is highly desirable for real-world applications, it presents significant challenges. Many approaches incorporate supplementary information, such as depth data, to derive valuable geometric characteristics. However, the challenge of deep neural networks inadequately extracting features from object regions in RGB images remains. To overcome these limitations, we introduce the Geometry-Focused Attention Network (GFA-Net), a novel framework designed for more comprehensive feature extraction by analyzing critical geometric and textural object characteristics. GFA-Net leverages Point-wise Feature Attention (PFA) to capture subtle pose differences, guiding the network to localize object regions and identify point-wise discrepancies as pose shifts. In addition, a Geometry Feature Aggregation Module (GFAM) integrates multi-scale geometric feature maps to distill crucial geometric features. Then, the resulting dense 2D–3D correspondences are passed to a Perspective-n-Point (PnP) module for 6-DoF pose computation. Experimental results on the LINEMOD and Occlusion LINEMOD datasets indicate that our proposed method is highly competitive with state-of-the-art approaches, achieving 96.54% and 49.35% accuracy, respectively, utilizing the ADD-S metric with a 0.10d threshold.
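This record reports accuracy under the ADD-S metric at a 0.10d threshold, where d is the diameter of the object model. As background, here is a minimal NumPy sketch of how ADD-S is conventionally computed for a symmetric-aware evaluation; the helper names `add_s` and `add_s_accuracy` are mine, not from the paper:

```python
import numpy as np

def add_s(pts, R_est, t_est, R_gt, t_gt):
    """ADD-S: mean distance from each model point under the estimated
    pose to the *closest* model point under the ground-truth pose."""
    est = pts @ R_est.T + t_est   # (m, 3) points under estimated pose
    gt = pts @ R_gt.T + t_gt      # (m, 3) points under ground-truth pose
    # Pairwise distances, then nearest ground-truth point per estimate.
    d = np.linalg.norm(est[:, None, :] - gt[None, :, :], axis=-1)
    return d.min(axis=1).mean()

def add_s_accuracy(errors, diameter, thresh=0.10):
    """Fraction of frames whose ADD-S error falls below thresh * diameter."""
    errors = np.asarray(errors)
    return float((errors < thresh * diameter).mean())
```

With an identical estimated and ground-truth pose the error is zero; a pose that is off by a small translation yields an error no larger than that translation, since each point may match an even closer neighbor.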
Saved in:
Main Authors: | Shuai Lin, Junhui Yu, Peng Su, Weitao Xue, Yang Qin, Lina Fu, Jing Wen, Hong Huang |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-12-01 |
Series: | Sensors |
Subjects: | pose estimation; RGB image; deep learning; geometric feature; dense correspondences |
Online Access: | https://www.mdpi.com/1424-8220/25/1/168 |
_version_ | 1841548894331207680 |
---|---|
author | Shuai Lin; Junhui Yu; Peng Su; Weitao Xue; Yang Qin; Lina Fu; Jing Wen; Hong Huang |
author_facet | Shuai Lin; Junhui Yu; Peng Su; Weitao Xue; Yang Qin; Lina Fu; Jing Wen; Hong Huang |
author_sort | Shuai Lin |
collection | DOAJ |
description | Six degrees of freedom (6-DoF) object pose estimation is essential for robotic grasping and autonomous driving. While estimating pose from a single RGB image is highly desirable for real-world applications, it presents significant challenges. Many approaches incorporate supplementary information, such as depth data, to derive valuable geometric characteristics. However, the challenge of deep neural networks inadequately extracting features from object regions in RGB images remains. To overcome these limitations, we introduce the Geometry-Focused Attention Network (GFA-Net), a novel framework designed for more comprehensive feature extraction by analyzing critical geometric and textural object characteristics. GFA-Net leverages Point-wise Feature Attention (PFA) to capture subtle pose differences, guiding the network to localize object regions and identify point-wise discrepancies as pose shifts. In addition, a Geometry Feature Aggregation Module (GFAM) integrates multi-scale geometric feature maps to distill crucial geometric features. Then, the resulting dense 2D–3D correspondences are passed to a Perspective-n-Point (PnP) module for 6-DoF pose computation. Experimental results on the LINEMOD and Occlusion LINEMOD datasets indicate that our proposed method is highly competitive with state-of-the-art approaches, achieving 96.54% and 49.35% accuracy, respectively, utilizing the ADD-S metric with a 0.10d threshold. |
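The pipeline described in the abstract hands dense 2D–3D correspondences to a Perspective-n-Point (PnP) module for the final pose computation. As a rough, self-contained illustration of that step only (a generic DLT-style PnP in NumPy under noise-free assumptions, not the paper's implementation; the function name `pnp_dlt` and the setup are my own):

```python
import numpy as np

def pnp_dlt(pts3d, pts2d, K):
    """Recover camera pose (R, t) from n >= 6 2D-3D correspondences
    via the Direct Linear Transform, assuming low-noise data.
    pts3d: (n, 3) model points; pts2d: (n, 2) pixels; K: (3, 3) intrinsics."""
    # Work in normalized camera coordinates: strip the intrinsics K.
    uv1 = np.hstack([pts2d, np.ones((len(pts2d), 1))])
    xy = (np.linalg.inv(K) @ uv1.T).T[:, :2]
    # Each correspondence contributes two rows to the system A p = 0,
    # where p stacks the 12 entries of the 3x4 projection matrix [R|t].
    rows = []
    for (X, Y, Z), (x, y) in zip(pts3d, xy):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -x * X, -x * Y, -x * Z, -x])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -y * X, -y * Y, -y * Z, -y])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    P = Vt[-1].reshape(3, 4)           # right singular vector of smallest sv
    P /= np.linalg.norm(P[2, :3])      # fix scale: third row of R has unit norm
    # Cheirality: flip the sign so the points lie in front of the camera.
    if np.mean(pts3d @ P[2, :3] + P[2, 3]) < 0:
        P = -P
    # Project the left 3x3 block onto the closest proper rotation matrix.
    U, _, Vh = np.linalg.svd(P[:, :3])
    R = U @ np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vh))]) @ Vh
    return R, P[:, 3]
```

In practice, methods in this family typically use a RANSAC-wrapped or learned PnP solver rather than a plain DLT, since dense correspondences from a network contain outliers.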
format | Article |
id | doaj-art-054dcf1b38cc4709a0f33ad20ac6fbb0 |
institution | Kabale University |
issn | 1424-8220 |
language | English |
publishDate | 2024-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj-art-054dcf1b38cc4709a0f33ad20ac6fbb0; 2025-01-10T13:21:06Z; eng; MDPI AG; Sensors (1424-8220); 2024-12-01; vol. 25, iss. 1, art. 168; DOI 10.3390/s25010168; GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation; Shuai Lin (Shandong Non-Metallic Materials Institute, Jinan 250031, China); Junhui Yu, Peng Su, Lina Fu, Jing Wen, Hong Huang (Key Laboratory of Optoelectronic Technology and Systems of the Education Ministry of China, Chongqing University, Chongqing 400044, China); Weitao Xue, Yang Qin (Beijing Institute of Space Machinery and Electronics, Product Testing Center, Beijing 100094, China); https://www.mdpi.com/1424-8220/25/1/168; pose estimation; RGB image; deep learning; geometric feature; dense correspondences |
spellingShingle | Shuai Lin; Junhui Yu; Peng Su; Weitao Xue; Yang Qin; Lina Fu; Jing Wen; Hong Huang; GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation; Sensors; pose estimation; RGB image; deep learning; geometric feature; dense correspondences |
title | GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation |
title_full | GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation |
title_fullStr | GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation |
title_full_unstemmed | GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation |
title_short | GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation |
title_sort | gfa net geometry focused attention network for six degrees of freedom object pose estimation |
topic | pose estimation; RGB image; deep learning; geometric feature; dense correspondences |
url | https://www.mdpi.com/1424-8220/25/1/168 |
work_keys_str_mv | AT shuailin gfanetgeometryfocusedattentionnetworkforsixdegreesoffreedomobjectposeestimation AT junhuiyu gfanetgeometryfocusedattentionnetworkforsixdegreesoffreedomobjectposeestimation AT pengsu gfanetgeometryfocusedattentionnetworkforsixdegreesoffreedomobjectposeestimation AT weitaoxue gfanetgeometryfocusedattentionnetworkforsixdegreesoffreedomobjectposeestimation AT yangqin gfanetgeometryfocusedattentionnetworkforsixdegreesoffreedomobjectposeestimation AT linafu gfanetgeometryfocusedattentionnetworkforsixdegreesoffreedomobjectposeestimation AT jingwen gfanetgeometryfocusedattentionnetworkforsixdegreesoffreedomobjectposeestimation AT honghuang gfanetgeometryfocusedattentionnetworkforsixdegreesoffreedomobjectposeestimation |