Research on Fine-Grained Visual Classification Method Based on Dual-Attention Feature Complementation
Fine-grained image classification is a notable challenge in computer vision. The main difficulty is that visually similar images often carry different labels, so the data exhibit high inter-class similarity and low intra-class similarity. A growing number of fine-grained classificatio...
| Main Authors: | Min Huang, Ke Li, Xiaoyan Yu, Chen Yang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | Fine-grained images; convolutional neural network; dual attention |
| Online Access: | https://ieeexplore.ieee.org/document/10577094/ |
| _version_ | 1846113864118173696 |
|---|---|
| author | Min Huang; Ke Li; Xiaoyan Yu; Chen Yang |
| author_facet | Min Huang; Ke Li; Xiaoyan Yu; Chen Yang |
| author_sort | Min Huang |
| collection | DOAJ |
| description | Fine-grained image classification is a notable challenge in computer vision. The main difficulty is that visually similar images often carry different labels, so the data exhibit high inter-class similarity and low intra-class similarity. A growing number of fine-grained classification models use attention mechanisms to extract discriminative regions, yet they overlook other, less obvious features that are equally discriminative. Moreover, these mechanisms typically enhance features along only one dimension while neglecting the other. In addition, features extracted from intermediate layers are not used effectively. To tackle these problems, we propose a fine-grained visual classification model based on dual-attention feature complementation. The model obtains features enhanced in both dimensions through cross-attention in two dimensions, and it lets the network explore other potentially discriminative regions by suppressing the enhanced features. Furthermore, a feature pyramid is employed to acquire multi-scale features, and an outer product is used to explore relationships among feature components, which improves the use of intermediate-layer features and the learning of fine details. Experiments show that our method requires no annotations beyond image labels and achieves satisfactory performance on several public fine-grained benchmark datasets. |
| format | Article |
| id | doaj-art-6478247bec3b4e6a8dc7fdf889e1fe8a |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-6478247bec3b4e6a8dc7fdf889e1fe8a; 2024-12-21T00:00:58Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; vol. 12, pp. 192209-192218; 10.1109/ACCESS.2024.3420429; 10577094; Research on Fine-Grained Visual Classification Method Based on Dual-Attention Feature Complementation; Min Huang (https://orcid.org/0000-0003-2744-0455), Ke Li (https://orcid.org/0009-0001-7193-2337), Xiaoyan Yu (https://orcid.org/0009-0002-7798-7942), Chen Yang (https://orcid.org/0009-0006-2501-1808); all authors: College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou, China; https://ieeexplore.ieee.org/document/10577094/; Fine-grained images, convolutional neural network, dual attention |
| title | Research on Fine-Grained Visual Classification Method Based on Dual-Attention Feature Complementation |
| topic | Fine-grained images; convolutional neural network; dual attention |
| url | https://ieeexplore.ieee.org/document/10577094/ |
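
The description field above mentions two operations that are easy to illustrate in isolation: an outer product that relates every feature component to every other, and the suppression of already-enhanced features so that other regions can be explored. The following is a minimal sketch of those two ideas only, not the authors' implementation. The function names `outer_product` and `suppress_peak`, the `factor` parameter, and the toy vectors are all hypothetical, and a real model would apply these operations to convolutional feature maps rather than to short lists.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def outer_product(u: Vector, v: Vector) -> Matrix:
    """Return the matrix M with M[i][j] = u[i] * v[j].

    Each entry pairs one component of u with one component of v,
    which is how an outer product exposes pairwise relationships.
    """
    return [[ui * vj for vj in v] for ui in u]


def suppress_peak(attention: Vector, factor: float = 0.0) -> Vector:
    """Scale the single most-attended position by `factor`.

    Assumption for illustration: "suppression" is modeled as damping
    the largest attention weight and leaving the others unchanged.
    """
    peak = max(range(len(attention)), key=lambda i: attention[i])
    return [a * factor if i == peak else a for i, a in enumerate(attention)]


# Toy feature vector (hypothetical values).
features = [1.0, 2.0, 3.0]
interactions = outer_product(features, features)

# Toy attention weights: the second position dominates and is suppressed.
masked = suppress_peak([0.1, 0.7, 0.2])
```

With the dominant weight zeroed out, a later stage would have to rely on the remaining positions, which is the intuition behind exploring less obvious discriminative regions.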