Fusing Transformers in a Tuning Fork Structure for Hyperspectral Image Classification Across Disjoint Samples

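The 3-D swin transformer (3DST) and spatial–spectral transformer (SST) each excel in capturing distinct aspects of image information: the 3DST with hierarchical attention and window-based processing, and the SST with self-attention mechanisms for long-range dependencies. However, applying them independently reveals the following limitations: the 3DST struggles with spectral information, while the SST lacks in capturing fine spatial details. In this article, we propose a novel tuning fork fusion approach to overcome these shortcomings, integrating the 3DST and SST to enhance the hyperspectral image (HSI) classification (HSIC). Our method integrates the hierarchical attention mechanism from the 3DST with the long-range dependence modeling of the SST. This combination refines spatial and spectral information representation and merges insights from both transformers at a fine-grained level. By emphasizing the fusion of attention mechanisms from both architectures, our approach significantly enhances the model's ability to capture complex spatial–spectral relationships, resulting in improved HSIC accuracy. In addition, we highlight the importance of disjoint training, validation, and test samples to enhance model generalization. Experimentation on benchmark HSI datasets demonstrates the superiority of our fusion approach over other state-of-the-art methods and standalone transformers. The source code has been developed from scratch and will be made public upon acceptance.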
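
The title refers to classification across disjoint samples. This record does not describe the authors' splitting protocol, so the following is only a minimal sketch of one common way to build non-overlapping training, validation, and test sets from a labeled ground-truth map. The function name `disjoint_split`, the fractions, and the convention that label 0 marks unlabeled pixels are all assumptions made for illustration, not details taken from the article.

```python
import numpy as np


def disjoint_split(gt, train_frac=0.2, val_frac=0.1, seed=0):
    """Split the labeled pixels of a ground-truth map into disjoint index sets.

    gt: 2-D integer array; 0 is assumed to mark unlabeled pixels.
    Returns three 1-D arrays of flat pixel indices: train, val, test.
    Assumes the map contains at least one labeled class.
    """
    rng = np.random.default_rng(seed)
    flat = gt.ravel()
    train, val, test = [], [], []
    for cls in np.unique(flat):
        if cls == 0:
            continue  # skip unlabeled background (assumed convention)
        # Shuffle this class's pixel indices, then cut into three slices.
        idx = rng.permutation(np.flatnonzero(flat == cls))
        n_train = int(round(train_frac * idx.size))
        n_val = int(round(val_frac * idx.size))
        train.append(idx[:n_train])
        val.append(idx[n_train:n_train + n_val])
        test.append(idx[n_train + n_val:])
    # Each pixel index lands in exactly one slice, so the sets cannot overlap.
    return tuple(np.concatenate(part) for part in (train, val, test))
```

Because each class is shuffled once and then sliced, no pixel index can appear in more than one set. The sketch only guarantees disjoint center pixels: if spatial patches are later extracted around those pixels, neighboring patches from different sets may still share pixels, which a stricter protocol would have to handle separately.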

Bibliographic Details
Main Authors:	Muhammad Ahmad, Muhammad Usama, Manuel Mazzara, Salvatore Distefano, Hamad Ahmed Altuwaijri, Silvia Liberata Ullo
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects:	3-D Swin Transformer (3DST); feature fusion; hyperspectral image classification (HSIC); spatial–spectral features; spatial–spectral transformer (SST)
Online Access:	https://ieeexplore.ieee.org/document/10685113/

_version_	1846163869374873600
author	Muhammad Ahmad, Muhammad Usama, Manuel Mazzara, Salvatore Distefano, Hamad Ahmed Altuwaijri, Silvia Liberata Ullo
author_sort	Muhammad Ahmad
collection	DOAJ
description	The 3-D swin transformer (3DST) and spatial–spectral transformer (SST) each excel in capturing distinct aspects of image information: the 3DST with hierarchical attention and window-based processing, and the SST with self-attention mechanisms for long-range dependencies. However, applying them independently reveals the following limitations: the 3DST struggles with spectral information, while the SST lacks in capturing fine spatial details. In this article, we propose a novel tuning fork fusion approach to overcome these shortcomings, integrating the 3DST and SST to enhance the hyperspectral image (HSI) classification (HSIC). Our method integrates the hierarchical attention mechanism from the 3DST with the long-range dependence modeling of the SST. This combination refines spatial and spectral information representation and merges insights from both transformers at a fine-grained level. By emphasizing the fusion of attention mechanisms from both architectures, our approach significantly enhances the model's ability to capture complex spatial–spectral relationships, resulting in improved HSIC accuracy. In addition, we highlight the importance of disjoint training, validation, and test samples to enhance model generalization. Experimentation on benchmark HSI datasets demonstrates the superiority of our fusion approach over other state-of-the-art methods and standalone transformers. The source code has been developed from scratch and will be made public upon acceptance.
format	Article
id	doaj-art-db3482e359d246c78013a2c7bebf6d7a
institution	Kabale University
issn	1939-1404 2151-1535
language	English
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling	Record: doaj-art-db3482e359d246c78013a2c7bebf6d7a (2024-11-19T00:00:55Z); Language: eng; Publisher: IEEE; Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; ISSN: 1939-1404, 2151-1535; Date: 2024-01-01; Volume: 17; Pages: 18167-18181; DOI: 10.1109/JSTARS.2024.3465831; Document number: 10685113
	Title: Fusing Transformers in a Tuning Fork Structure for Hyperspectral Image Classification Across Disjoint Samples
	Authors: Muhammad Ahmad [0] (https://orcid.org/0000-0002-3320-2261); Muhammad Usama [1] (https://orcid.org/0000-0001-5015-8605); Manuel Mazzara [2] (https://orcid.org/0000-0002-3860-4948); Salvatore Distefano [3]; Hamad Ahmed Altuwaijri [4] (https://orcid.org/0000-0002-2604-5974); Silvia Liberata Ullo [5] (https://orcid.org/0000-0001-6294-0581)
	Affiliations: [0] Department of Computer Science, National University of Computer and Emerging Sciences Islamabad, Chiniot-Faisalabad Campus, Chiniot, Pakistan; [1] Department of Computer Science, National University of Computer and Emerging Sciences Islamabad, Chiniot-Faisalabad Campus, Chiniot, Pakistan; [2] Institute of Software Development and Engineering, Innopolis University, Innopolis, Russia; [3] Dipartimento di Matematica e Informatica - MIFT, University of Messina, Messina, Italy; [4] Department of Geography, College of Humanities and Social Sciences, King Saud University, Riyadh, Saudi Arabia; [5] Department of Engineering, University of Sannio, Benevento, Italy
	URL: https://ieeexplore.ieee.org/document/10685113/
	Keywords: 3-D Swin Transformer (3DST); feature fusion; hyperspectral image classification (HSIC); spatial–spectral features; spatial–spectral transformer (SST)
title	Fusing Transformers in a Tuning Fork Structure for Hyperspectral Image Classification Across Disjoint Samples
topic	3-D Swin Transformer (3DST); feature fusion; hyperspectral image classification (HSIC); spatial–spectral features; spatial–spectral transformer (SST)
url	https://ieeexplore.ieee.org/document/10685113/


Similar Items