SXSFormer: Spectral Squeeze and Expansion Swin Transformer Network for Hyperspectral Image Classification

Pau, Giovanni
2025-01-01

Abstract

Hyperspectral images (HSIs) are highly complex, adding a rich spectral dimension to the two spatial dimensions of conventional images. Deep learning methods are increasingly being applied to process this three-dimensional data for hyperspectral image classification (HSIC). Existing convolutional and transformer-based methods often struggle to capture fine-grained spectral-spatial dependencies and to achieve high accuracy while keeping computational complexity in check. To address these problems, we propose the novel SXSFormer network for HSIC. This approach integrates Squeeze and Expansion (SX) blocks into the Swin Transformer architecture to enhance the model's feature extraction and attention mechanisms. At its core, the SX block recalibrates channel-wise features by temporarily expanding and then compressing the channel dimensionality, allowing the model to focus on informative spectral bands and capture complex interdependencies. Equipped with window-based multi-head self-attention, SXSFormer efficiently captures long-range dependencies at reduced computational cost by partitioning the input into non-overlapping windows. Additionally, integrating the SX block within the Swin Transformer improves global contextual understanding by selectively weighting the feature maps. We further refine the architecture with components such as patch extraction and embedding layers and a patch merging strategy, ensuring efficient multi-scale feature extraction. Extensive experiments on four benchmark HSI datasets (SA, IP, PU, and KSC) demonstrate that the proposed model achieves test accuracies of 99.97%, 98.15%, 99.63%, and 98.14%, respectively, outperforming existing state-of-the-art methods. The proposed approach also generalizes well to new datasets. Overall, SXSFormer represents a promising direction for HSIC.
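The abstract describes the SX block as temporarily expanding and then compressing the channel dimensionality to produce per-channel weights that emphasize informative spectral bands. The sketch below is a minimal, hypothetical PyTorch illustration of such a channel-recalibration block operating on transformer tokens; the class name SXBlock, the expansion ratio, the token-wise mean pooling, and the specific layer choices are assumptions for illustration, not the paper's actual design.

import torch
import torch.nn as nn

class SXBlock(nn.Module):
    """Illustrative channel recalibration: expand, then compress.

    Produces per-channel attention weights used to reweight the
    spectral feature channels of transformer tokens.
    """

    def __init__(self, channels: int, expansion: int = 4):
        super().__init__()
        hidden = channels * expansion  # hypothetical expansion ratio
        self.recalibrate = nn.Sequential(
            nn.Linear(channels, hidden),   # expand channel dimensionality
            nn.GELU(),
            nn.Linear(hidden, channels),   # compress back to original size
            nn.Sigmoid(),                  # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, channels), e.g. one Swin window of embeddings
        ctx = x.mean(dim=1)                # pool a global context per channel
        w = self.recalibrate(ctx)          # (batch, channels) weights
        return x * w.unsqueeze(1)          # selectively reweight channels

if __name__ == "__main__":
    block = SXBlock(channels=96)
    tokens = torch.randn(2, 49, 96)        # e.g. a 7x7 window, 96-dim tokens
    print(block(tokens).shape)             # torch.Size([2, 49, 96])

In a Swin-style stage, a block of this kind would sit alongside the window-based multi-head self-attention and MLP sublayers, so the recalibrated channels feed into the attention computed within each non-overlapping window.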


Use this identifier to cite or link to this document: https://hdl.handle.net/11387/195273