Integrating Historical Learning and Multi-View Attention with Hierarchical Feature Fusion for Robotic Manipulation

Bibliographic Details
Main Authors: Gaoxiong Lu, Zeyu Yan, Jianing Luo, Wei Li
Format: Article
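Language: English
Published: MDPI AG 2024-11-01
Series: Biomimetics
Subjects: robotic manipulation; historical information; multi-view attention; hierarchical visual representations
Online Access: https://www.mdpi.com/2313-7673/9/11/712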
author Gaoxiong Lu
Zeyu Yan
Jianing Luo
Wei Li
collection DOAJ
description Humans typically make decisions based on past experience and observation. In robotic manipulation, by contrast, action prediction often relies solely on current observations, so robots tend to overlook environmental changes or fail when current observations are suboptimal. To address this challenge, and inspired by human cognitive processes, we propose a method that integrates historical learning and multi-view attention to improve robotic manipulation performance. Built on a spatio-temporal attention mechanism, the method combines observations from current and past steps and also integrates historical actions, so that it better perceives changes in the robot’s behaviour and their impact on the environment. We also employ a mutual information-based multi-view attention module that automatically focuses on valuable perspectives, thereby incorporating more effective information for decision-making. Furthermore, inspired by the human visual system, which processes both global context and local texture details, we devise a method that merges semantic and texture features, helping robots understand the task and handle fine-grained tasks. Extensive experiments in RLBench and real-world scenarios demonstrate that our method handles various tasks effectively and exhibits notable robustness and adaptability.
format Article
id doaj-art-52c9548ad19e4438960a2512e2dfd859
institution Kabale University
issn 2313-7673
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Biomimetics
spelling doaj-art-52c9548ad19e4438960a2512e2dfd859; 2024-11-26T17:53:50Z; eng; MDPI AG; Biomimetics; ISSN 2313-7673; 2024-11-01; vol. 9, no. 11, art. 712; DOI 10.3390/biomimetics9110712; Gaoxiong Lu, Zeyu Yan, Jianing Luo, Wei Li (all: The Academy for Engineering and Technology, Fudan University, Shanghai 200433, China); https://www.mdpi.com/2313-7673/9/11/712
title Integrating Historical Learning and Multi-View Attention with Hierarchical Feature Fusion for Robotic Manipulation
topic robotic manipulation
historical information
multi-view attention
hierarchical visual representations
url https://www.mdpi.com/2313-7673/9/11/712