Repository logo
 

Towards generating a pre-training image transformer framework for preserving spatio-spectral properties in hyperspectral satellite images

Abstract

Hyperspectral images facilitate advanced geospatial analysis without the need for expensive ground surveys. Machine learning approaches are particularly well-suited for handling the geospatial coverage required by these applications. While self-supervised learning is a promising methodology for managing voluminous datasets with limited labels, existing encoders in self-supervised learning face challenges when applied to hyperspectral images due to the large number of spectral channels. We propose a novel hyperspectral image encoding framework designed to generate highly representative embeddings for subsequent geospatial analysis. Our framework extends the Vision Transformer model with dynamic masking strategies to enhance model performance in regions with high spatial variability. We introduce a novel loss function that incorporates spectral quality metrics and employs the unique channel grouping strategy to leverage spectral similarity across channels. We demonstrate the effectiveness of our approach through a downstream model for estimating soil texture at a 30-meter resolution.

Description

Rights Access

Subject

masked autoencoder
vision transformer
remote sensing
hyperspectral

Citation

Associated Publications