LATrack: Limited Attention for Visual Object Tracking

The use of temporal information is becoming increasingly important in mainstream visual object trackers. Mainstream trackers typically interact trajectory information with image features. However, existing methods of interaction cannot effectively utilize trajectory information, and the significance...

Bibliographic Details
Main Authors: Jian Shi, Zheng Chang, Yang Yu, Junze Shi, Haibo Luo
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Limited attention; temporal prompt; vision transformer; visual object tracking
Online Access: https://ieeexplore.ieee.org/document/10820357/
_version_ 1841550772431486976
author Jian Shi
Zheng Chang
Yang Yu
Junze Shi
Haibo Luo
author_sort Jian Shi
collection DOAJ
description The use of temporal information is becoming increasingly important in mainstream visual object trackers. Mainstream trackers typically interact trajectory information with image features. However, existing methods of interaction cannot effectively utilize trajectory information, and the significance of the interaction remains unclear. To address these issues, we propose a Limited Attention module (LA module). The LA module more effectively utilizes image features by masking certain image features based on historical trajectory information or prediction information. Based on the LA module, we propose Limited Attention Track (LATrack), which can make more effective use of trajectory information. LATrack can continuously approach the target object by utilizing the predicted coordinates of the historical trajectory, thereby obtaining the object’s position in the current frame. Our model excels in handling challenges such as motion blur, distraction from similar objects, and occlusion. LATrack demonstrates excellent performance across multiple datasets, notably achieving 76.7% AO and 53.8% AUC on the GOT-10k and $\mathrm{LaSOT_{ext}}$ datasets, respectively.
format Article
id doaj-art-6f1e2b94b91d4c2e89e8249a6e19f9d7
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-6f1e2b94b91d4c2e89e8249a6e19f9d7
2025-01-10T00:01:15Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2025-01-01, vol. 13, pp. 4034-4047
DOI 10.1109/ACCESS.2024.3525016
IEEE document 10820357
LATrack: Limited Attention for Visual Object Tracking
Jian Shi (https://orcid.org/0009-0000-3858-8864), Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China
Zheng Chang (https://orcid.org/0000-0001-7705-0194), Chinese Academy of Sciences, Shenyang Institute of Automation, Shenyang, China
Yang Yu, Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China
Junze Shi (https://orcid.org/0009-0006-5028-2749), Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China
Haibo Luo (https://orcid.org/0000-0001-6425-6433), Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China
https://ieeexplore.ieee.org/document/10820357/
Limited attention
temporal prompt
vision transformer
visual object tracking
title LATrack: Limited Attention for Visual Object Tracking
topic Limited attention
temporal prompt
vision transformer
visual object tracking
url https://ieeexplore.ieee.org/document/10820357/