Research on Fine-Grained Visual Classification Method Based on Dual-Attention Feature Complementation

Fine-grained image classification is a notable challenge in the field of computer vision. The primary influencing factor is that similar images often have different labels, meaning there is high inter-class similarity and low intra-class similarity. An increasing number of fine-grained classificatio...

Bibliographic Details
Main Authors: Min Huang, Ke Li, Xiaoyan Yu, Chen Yang
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Fine-grained images; convolutional neural network; dual attention
Online Access: https://ieeexplore.ieee.org/document/10577094/
collection DOAJ
description Fine-grained image classification is a notable challenge in computer vision. The main difficulty is that visually similar images often carry different labels, while images with the same label can look quite different; that is, inter-class similarity is high and intra-class similarity is low. A growing number of fine-grained classification models use attention mechanisms to extract the most discriminative regions, yet they overlook other equally discriminative but less obvious features. Moreover, these mechanisms typically enhance features along only one dimension while neglecting the other. In addition, features extracted from intermediate layers are not used effectively. To tackle these problems, we propose a fine-grained visual classification model based on dual-attention feature complementation. The model obtains features enhanced in both dimensions through cross-attention across the two dimensions, and it lets the network explore other potentially discriminative regions by suppressing the enhanced features. Furthermore, a feature pyramid is employed to acquire multi-scale features, and an outer product is used to model relationships among feature components, improving the use of intermediate-layer features and the learning of fine details. Experiments show that our method requires no annotations beyond image labels and achieves satisfactory performance on several public fine-grained benchmark datasets.
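Two of the operations named in the description, suppressing the strongest responses so that other regions can be explored and combining feature components with an outer product, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `suppress_peak` and `outer_product_pool`, the `ratio` threshold, the use of the channel mean as a saliency map, and the signed-square-root normalisation are all assumptions made for illustration. The sketch also assumes non-negative feature maps (for example after a ReLU) that share the same spatial size.

```python
import numpy as np


def suppress_peak(feat, ratio=0.5):
    """Zero out the spatial locations with the strongest mean response.

    feat: array shaped (C, H, W), assumed non-negative.
    ratio: hypothetical threshold; locations whose channel-mean response
           reaches ratio * max are suppressed.
    """
    saliency = feat.mean(axis=0)                 # (H, W) channel-mean map
    threshold = ratio * saliency.max()
    mask = (saliency < threshold).astype(feat.dtype)
    return feat * mask                           # mask broadcasts over channels


def outer_product_pool(feat_a, feat_b):
    """Average the per-location outer products of two feature maps.

    feat_a: (Ca, H, W), feat_b: (Cb, H, W), same H and W.
    Returns a normalised vector of length Ca * Cb.
    """
    ca, h, w = feat_a.shape
    cb = feat_b.shape[0]
    a = feat_a.reshape(ca, h * w)
    b = feat_b.reshape(cb, h * w)
    pooled = (a @ b.T) / (h * w)                 # (Ca, Cb) mean outer product
    vec = pooled.reshape(-1)
    vec = np.sign(vec) * np.sqrt(np.abs(vec))    # signed square root (assumed step)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shallow = rng.random((8, 7, 7))              # stand-in for an intermediate-layer map
    deep = rng.random((16, 7, 7))                # stand-in for a deeper map, same size
    complementary = suppress_peak(deep)
    descriptor = outer_product_pool(shallow, complementary)
    print(descriptor.shape)
```

In this sketch the suppressed map is what a second branch would see, so it is pushed toward regions other than the single most salient one, while the outer product captures pairwise interactions between channels of the two maps.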
format Article
id doaj-art-6478247bec3b4e6a8dc7fdf889e1fe8a
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-6478247bec3b4e6a8dc7fdf889e1fe8a
Timestamp: 2024-12-21T00:00:58Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Date: 2024-01-01
Volume: 12
Pages: 192209-192218
DOI: 10.1109/ACCESS.2024.3420429
Document ID: 10577094
Title: Research on Fine-Grained Visual Classification Method Based on Dual-Attention Feature Complementation
Authors: Min Huang (https://orcid.org/0000-0003-2744-0455), Ke Li (https://orcid.org/0009-0001-7193-2337), Xiaoyan Yu (https://orcid.org/0009-0002-7798-7942), Chen Yang (https://orcid.org/0009-0006-2501-1808)
Affiliation (all authors): College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou, China
URL: https://ieeexplore.ieee.org/document/10577094/
Keywords: Fine-grained images; convolutional neural network; dual attention
title Research on Fine-Grained Visual Classification Method Based on Dual-Attention Feature Complementation
topic Fine-grained images
convolutional neural network
dual attention
url https://ieeexplore.ieee.org/document/10577094/