Pang et al. (2026) Enhancing cloud detection across multiple satellite sensors using a combined Swin Transformer and UPerNet deep learning model
Identification
- Journal: Remote Sensing of Environment
- Year: 2026
- Date: 2026-01-09
- Authors: Shulin Pang, Zhanqing Li, Lin Sun, Biao Cao, Zhihui Wang, Xinyuan Xi, Xiaohang Shi, Jie Xu, Jing Wei
- DOI: 10.1016/j.rse.2025.115206
Research Groups
- Innovation Research Center of Satellite Application, Faculty of Geographical Science, Beijing Normal University, Beijing, China
- Department of Atmospheric and Oceanic Science, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD, USA
- College of Geomatics, Shandong University of Science and Technology, Qingdao, China
- University of Science and Technology of China, Hefei, China
- College of Marine Technology, Ocean University of China, Qingdao, China
- State Key Laboratory of Climate System Prediction and Risk Management, School of Atmospheric Physics, Nanjing University of Information Science and Technology, Nanjing, China
- School of Geography, Nanjing Normal University, Nanjing, China
- MEEKL-AERM, College of Environmental Sciences and Engineering, Institute of Tibetan Plateau, and Center for Environment and Health, Peking University, Beijing, China
Short Summary
This paper introduces STUPmask, a novel deep learning model combining Swin Transformer and UPerNet, to significantly enhance cloud detection accuracy across multiple satellite sensors and diverse imagery types. The model demonstrates improved performance in identifying challenging cloud types and exhibits strong adaptability to cross-sensor data with varying spatial resolutions.
Objective
- To develop an advanced deep learning method for robust and accurate cloud detection across multiple satellite sensors, spectral bands (visible to thermal infrared), and spatial resolutions (meters to kilometers), particularly for challenging cloud types (broken, thin, semi-transparent) and diverse surface conditions (including bright scenes).
Study Configuration
- Spatial Scale: Global, with applicability across spatial resolutions ranging from 4 meters to 2 kilometers.
- Temporal Scale: No specific study period is stated. The model is trained on global datasets and applied to diverse imagery types, so it is not tied to a particular acquisition time.
Methodology and Data
- Models used: STUPmask, an encoder-decoder deep learning model combining Swin Transformer and UPerNet.
- Data sources: Landsat 8 and Sentinel-2 Manually Cloud Validation Mask datasets (for training and validation). Cross-sensor adaptability tested with data from GaoFen-2, MODIS, and Himawari-8.
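The encoder-decoder pairing described above can be illustrated with a small shape calculation. This is a minimal sketch, not the authors' implementation: it assumes the standard Swin-T layout (embedding dimension 96, four stages at strides 4, 8, 16, and 32) and the default UPerNet pyramid pooling scales (1, 2, 3, 6). The summary does not state STUPmask's actual configuration, and the function names `swin_stage_shapes` and `upernet_fusion_plan` are hypothetical.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def swin_stage_shapes(height: int, width: int, embed_dim: int = 96) -> List[Shape]:
    """Feature-map shapes of a four-stage hierarchical Swin encoder.

    Assumes the standard layout: stage k has stride 4 * 2**k and
    embed_dim * 2**k channels (embed_dim=96 corresponds to Swin-T).
    """
    shapes = []
    for stage in range(4):
        stride = 4 * 2 ** stage            # 4, 8, 16, 32
        channels = embed_dim * 2 ** stage  # C, 2C, 4C, 8C
        shapes.append((channels, height // stride, width // stride))
    return shapes


def upernet_fusion_plan(
    shapes: List[Shape], pool_scales: Tuple[int, ...] = (1, 2, 3, 6)
) -> Dict[str, object]:
    """Describe how a UPerNet-style decoder consumes the encoder stages.

    Pyramid pooling runs on the coarsest stage; a top-down feature pyramid
    then merges all stages at the resolution of the finest one.
    pool_scales are the common UPerNet defaults (an assumption here).
    """
    coarsest, finest = shapes[-1], shapes[0]
    return {
        "ppm_input": coarsest,
        "ppm_grids": [(s, s) for s in pool_scales],
        "fpn_levels": len(shapes),
        "fused_size": finest[1:],
    }


# Example: a hypothetical 512 x 512 input patch.
stages = swin_stage_shapes(512, 512)
plan = upernet_fusion_plan(stages)
```

The fused feature map sits at one quarter of the input resolution, so a final upsampling step is needed to produce a per-pixel cloud mask. That multi-scale fusion is the usual motivation for pairing a hierarchical encoder with a pyramid decoder when targets range from small broken clouds to large cloud decks.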
Main Results
- STUPmask accurately estimates cloud amount with marginal differences against reference masks (0.27 % for Landsat 8 and -0.81 % for Sentinel-2).
- Achieves high overall classification accuracy for cloud distribution (97.51 % for Landsat 8 and 96.27 % for Sentinel-2).
- Excels at detecting broken, thin, and semi-transparent clouds across diverse surfaces, including bright surfaces such as urban areas, barren land, snow, and ice.
- Adapts well to cross-sensor satellite data (GaoFen-2, MODIS, Himawari-8) with spatial resolutions from 4 m to 2 km on Low-Earth-Orbit (LEO) and Geostationary-Earth-Orbit (GEO) platforms, achieving an overall accuracy of 94.21–97.11 %.
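The two headline metrics above, cloud amount difference and overall accuracy, can be computed from a predicted and a reference binary mask. The sketch below shows one plausible way to do so. It assumes masks are lists of rows with 1 for cloud and 0 for clear, and that the difference is taken as predicted minus reference; the summary does not state the paper's sign convention, and the function names are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # 1 = cloud, 0 = clear


def cloud_fraction(mask: Mask) -> float:
    """Fraction of pixels flagged as cloud."""
    total = sum(len(row) for row in mask)
    cloudy = sum(sum(row) for row in mask)
    return cloudy / total


def cloud_amount_difference(pred: Mask, ref: Mask) -> float:
    """Cloud amount difference in percentage points.

    Assumed convention: predicted minus reference, so a positive value
    means the model flags more cloud than the reference mask.
    """
    return 100.0 * (cloud_fraction(pred) - cloud_fraction(ref))


def overall_accuracy(pred: Mask, ref: Mask) -> float:
    """Percentage of pixels whose predicted label matches the reference."""
    matches = sum(
        p == r
        for pred_row, ref_row in zip(pred, ref)
        for p, r in zip(pred_row, ref_row)
    )
    total = sum(len(row) for row in ref)
    return 100.0 * matches / total


# Toy 2 x 4 scene: the prediction has one false-positive cloud pixel.
pred = [[1, 1, 0, 0],
        [1, 0, 0, 0]]
ref = [[1, 1, 0, 0],
       [0, 0, 0, 0]]

diff = cloud_amount_difference(pred, ref)  # 12.5 percentage points
acc = overall_accuracy(pred, ref)          # 87.5 percent
```

The two metrics answer different questions. Cloud amount difference can be close to zero even when individual pixels are wrong, because false positives and false negatives cancel, which is why it is reported alongside a per-pixel accuracy.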
Contributions
- Introduction of STUPmask, a novel deep learning model integrating Swin Transformer and UPerNet, for advanced cloud detection.
- Demonstrated significant improvement in cloud detection accuracy and robust adaptability across a wide range of satellite sensors, spectral bands, spatial resolutions, and challenging surface types.
- Superior performance in identifying difficult cloud types (broken, thin, semi-transparent) over complex and bright surfaces.
- Provides a versatile and highly adaptable method for automatic cloud identification, enhancing remote sensing studies across various applications.
Funding
- Not specified in the provided paper text.
Citation
@article{Pang2026Enhancing,
author = {Pang, Shulin and Li, Zhanqing and Sun, Lin and Cao, Biao and Wang, Zhihui and Xi, Xinyuan and Shi, Xiaohang and Xu, Jie and Wei, Jing},
title = {Enhancing cloud detection across multiple satellite sensors using a combined Swin Transformer and UPerNet deep learning model},
journal = {Remote Sensing of Environment},
year = {2026},
doi = {10.1016/j.rse.2025.115206},
url = {https://doi.org/10.1016/j.rse.2025.115206}
}
Original Source: https://doi.org/10.1016/j.rse.2025.115206