LATrack: Limited Attention for Visual Object Tracking

The use of temporal information is becoming increasingly important in mainstream visual object trackers. Mainstream trackers typically interact trajectory information with image features. However, existing methods of interaction cannot effectively utilize trajectory information, and the significance...

Bibliographic Details
Main Authors:	Jian Shi, Zheng Chang, Yang Yu, Junze Shi, Haibo Luo
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Limited attention, temporal prompt, vision transformer, visual object tracking
Online Access:	https://ieeexplore.ieee.org/document/10820357/

author	Jian Shi, Zheng Chang, Yang Yu, Junze Shi, Haibo Luo
collection	DOAJ
description	The use of temporal information is becoming increasingly important in mainstream visual object trackers. Mainstream trackers typically interact trajectory information with image features. However, existing methods of interaction cannot effectively utilize trajectory information, and the significance of the interaction remains unclear. To address these issues, we propose a Limited Attention module (LA module). The LA module more effectively utilizes image features by masking certain image features based on historical trajectory information or prediction information. Based on the LA module, we propose Limited Attention Track (LATrack), which can make more effective use of trajectory information. LATrack can continuously approach the target object by utilizing the predicted coordinates of the historical trajectory, thereby obtaining the object’s position in the current frame. Our model excels in handling challenges such as motion blur, distraction from similar objects, and occlusion. LATrack demonstrates excellent performance across multiple datasets, notably achieving 76.7% AO and 53.8% AUC on the GOT-10k and $\mathrm{LaSOT_{ext}}$ datasets, respectively.
format	Article
id	doaj-art-6f1e2b94b91d4c2e89e8249a6e19f9d7
institution	Kabale University
issn	2169-3536
language	English
publishDate	2025-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-6f1e2b94b91d4c2e89e8249a6e19f9d7 | 2025-01-10T00:01:15Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | 13 | 4034 | 4047 | 10.1109/ACCESS.2024.3525016 | 10820357 | LATrack: Limited Attention for Visual Object Tracking | Jian Shi [0] https://orcid.org/0009-0000-3858-8864 | Zheng Chang [1] https://orcid.org/0000-0001-7705-0194 | Yang Yu [2] | Junze Shi [3] https://orcid.org/0009-0006-5028-2749 | Haibo Luo [4] https://orcid.org/0000-0001-6425-6433 | [0] Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China | [1] Chinese Academy of Sciences, Shenyang Institute of Automation, Shenyang, China | [2] Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China | [3] Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China | [4] Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China | https://ieeexplore.ieee.org/document/10820357/ | Limited attention, temporal prompt, vision transformer, visual object tracking
title	LATrack: Limited Attention for Visual Object Tracking
topic	Limited attention, temporal prompt, vision transformer, visual object tracking
url	https://ieeexplore.ieee.org/document/10820357/


Similar Items