Pfreundschuh et al. (2026) A Benchmark Dataset for Satellite-Based Estimation and Detection of Rain

Identification

Journal: Scientific Data
Year: 2026
Date: 2026-01-15
Authors: Simon Pfreundschuh, Malarvizhi Arulraj, Ali Behrangi, Linda Bogerd, Alan J. P. Calheiros, Daniele Casella, Neda Dolatabadi, Clément Guilloteau, Jie Gong, Christian D. Kummerow, Pierre Kirstetter, G. Lee, Maximilian Maahn, Lisa Milani, Giulia Panegrossi, Rayana Santos Araújo Palharini, Veljko Petković, Soorok Ryu, Paolo Sanò, Jackson Tan
DOI: 10.1038/s41597-026-06565-0

Research Groups

Department of Atmospheric Science, Colorado State University
Earth System Science Interdisciplinary Center, University of Maryland
Department of Hydrology and Atmospheric Sciences, University of Arizona
Instituto Nacional de Pesquisas Espaciais
Institute of Atmospheric Sciences and Climate, Italian National Research Council
Department of Civil and Environmental Engineering, University of California Irvine
NASA Goddard Space Flight Center
School of Meteorology & School of Civil Engineering and Environmental Science, University of Oklahoma
Department of Atmospheric Sciences, Kyungpook National University
Institute for Meteorology, Leipzig University
Departamento de Prevención de Riesgos y Medio Ambiente, Universidad Tecnológica Metropolitana
Cooperative Institute for Satellite Earth System Studies, University of Maryland
University of Maryland, Baltimore County

Short Summary

This paper introduces SatRain, the first AI benchmark dataset for satellite-based rain detection and estimation, integrating multi-sensor satellite observations with high-quality ground-based radar and gauge reference data. It provides a standardized evaluation protocol and out-of-distribution test sets to enable robust and reproducible comparisons of machine learning approaches for precipitation retrieval.

Objective

To develop and provide SatRain, a standardized, AI-ready benchmark dataset for satellite-based detection and estimation of rain, addressing the lack of fair comparison between machine learning methods in precipitation retrieval.

Study Configuration

Spatial Scale:
- Training/Validation: Conterminous United States (CONUS).
- Independent Test Sets: South Korea, Austria (Feldbach region).
- Gridded data: 0.036° regular latitude-longitude grid.
- Native sensor sampling: GMI (7.2 km x 4.4 km to 32 km x 19 km footprint), ATMS (16 km to 75 km nadir footprint).
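
For a sense of scale, the 0.036° grid spacing can be converted to an approximate ground distance. The sketch below assumes a spherical Earth with a mean radius of 6371 km; the function name and the 40° N example latitude are illustrative and not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius
GRID_SPACING_DEG = 0.036  # grid spacing stated in the summary


def cell_size_km(lat_deg, spacing_deg=GRID_SPACING_DEG):
    """Return the approximate (north-south, east-west) cell extent in km."""
    km_per_deg = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = spacing_deg * km_per_deg
    # Meridians converge toward the poles, so the east-west extent shrinks
    # with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


# Hypothetical example latitude within CONUS.
ns_km, ew_km = cell_size_km(40.0)
print(f"{ns_km:.2f} km x {ew_km:.2f} km")
```

Under these assumptions a grid cell spans roughly 4 km north-south and about 3 km east-west at 40° N, which is finer than the smallest GMI footprint listed above.
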
Temporal Scale:
- Training/Validation: 2018–2021 (first five days of each month for validation, rest for training).
- CONUS Test: 2022.
- Austria Test: 2021–2022.
- Korea Test: October 2022–October 2023.
- PMW observations: Discrete overpass times (GPM revisit times can exceed 3 hours).
- Geostationary observations: Multi-channel Vis/IR at 10-minute resolution (1-hour window around PMW overpass), single-channel IR at 30-minute resolution (8-hour window around PMW overpass).
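
The CONUS split described above depends only on the observation date, so it can be written as a small rule. The following is a minimal sketch of that rule, not the dataset's own tooling; the function name `assign_conus_split` and the returned labels are hypothetical.

```python
from datetime import date

VALIDATION_DAYS = range(1, 6)         # first five days of each month
TRAIN_VAL_YEARS = range(2018, 2022)   # 2018 through 2021
CONUS_TEST_YEAR = 2022


def assign_conus_split(day):
    """Return 'training', 'validation', 'test', or None for other dates."""
    if day.year == CONUS_TEST_YEAR:
        return "test"
    if day.year not in TRAIN_VAL_YEARS:
        return None
    if day.day in VALIDATION_DAYS:
        return "validation"
    return "training"


examples = [date(2019, 3, 5), date(2019, 3, 6), date(2022, 7, 1)]
splits = [assign_conus_split(d) for d in examples]
print(splits)
```

Holding out whole days at the start of each month keeps every season represented in the validation data while leaving all scenes from the test year untouched.
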

Methodology and Data

Models used:
- For benchmarking: U-Net-type encoder-decoder based on EfficientNet-V2 architecture (14 million parameters), Random Forests, XGBoost, Multi-layer perceptron (MLP).
- Baselines for comparison: GPROF V7 (GPM operational PMW algorithm), ERA5 reanalysis.
Data sources:
- Passive Microwave (PMW) Satellite Observations: GPM Microwave Imager (GMI) aboard GPM Core Observatory, Advanced Technology Microwave Sounder (ATMS) aboard NOAA-20 satellite.
- Geostationary Visible and Infrared (Vis/IR) Observations: Advanced Baseline Imager (ABI) aboard GOES-16 (CONUS), Advanced Himawari Imager (AHI) aboard Himawari-8 and -9 (Korea), Spinning Enhanced Visible and Infrared Imager (SEVIRI) aboard Meteosat-10 (Austria).
- Gridded Geostationary IR Observations: Climate Prediction Center (CPC) global gridded geostationary IR dataset (11 µm infrared window).
- Ancillary Environmental Data: Dynamic variables from ERA5 reanalysis (e.g., 10-meter wind, 2-meter dew point/temperature, CAPE, sea ice concentration, sea surface temperature, skin temperature, snow depth/fall, surface pressure, total column cloud ice/liquid water, total column water vapor, total/convective precipitation, leaf area index), GPROF 18-class dynamic surface classification, NOAA Global Land One-kilometer Base Elevation (GLOBE) digital elevation model.
- Reference Precipitation Estimates:
  - CONUS: NOAA's gauge-corrected Multi-Radar Multi-Sensor (MRMS) product.
  - South Korea: Gauge-corrected ground-based radar rainfall estimates (optimized for Korean domain).
  - Austria: Gauge measurements from the WegenerNet gauge network.

Main Results

The SatRain dataset enables the training of machine learning precipitation retrievals that consistently outperform conventional baselines (GPROF V7, ERA5) across various evaluation tasks (precipitation rate estimation, probabilistic/deterministic detection of precipitation and heavy precipitation) and independent test domains (CONUS, Korea, Austria).
Specifically, the SatRain-trained GMI retrieval consistently showed better agreement with reference precipitation fields and higher accuracy metrics than the operational GPROF V7, despite using the same input observations.
SatRain-trained retrievals using geostationary measurements (Geo and Geo-IR) generally exhibited lower overall accuracy than PMW-based retrievals but still performed better than the ERA5 reanalysis.
Among different machine learning techniques, U-Net-based models consistently outperformed Random Forests, XGBoost, and Multi-layer perceptrons (MLP) across both GMI and ATMS observations and all three test domains.
Although the SatRain retrievals show biases over the out-of-distribution regions (Austria and Korea), the operational algorithms show these biases as well, and the relative performance ranking of retrieval methods remained largely consistent, suggesting good generalizability of SatRain-based evaluations.
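
To make the two kinds of evaluation task concrete, the sketch below computes a few scores that are commonly used for precipitation rate estimation (bias, mean squared error, correlation) and for deterministic detection (probability of detection, false alarm ratio, critical success index). This summary does not list the metrics prescribed by the SatRain protocol, so the choice of scores, the 0.1 mm/h rain threshold, the function names, and the sample values are all assumptions made for illustration.

```python
import math


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def estimation_scores(estimate, reference):
    """Mean bias, mean squared error, and Pearson correlation."""
    n = len(reference)
    mean_est = sum(estimate) / n
    mean_ref = sum(reference) / n
    pairs = list(zip(estimate, reference))
    mse = sum((e - r) ** 2 for e, r in pairs) / n
    cov = sum((e - mean_est) * (r - mean_ref) for e, r in pairs)
    var_est = sum((e - mean_est) ** 2 for e in estimate)
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    corr = _ratio(cov, math.sqrt(var_est * var_ref))
    return {"bias": mean_est - mean_ref, "mse": mse, "correlation": corr}


def detection_scores(estimate, reference, threshold=0.1):
    """POD, FAR, and CSI for rain/no-rain at an assumed threshold (mm/h)."""
    hits = misses = false_alarms = 0
    for e, r in zip(estimate, reference):
        predicted, observed = e >= threshold, r >= threshold
        if predicted and observed:
            hits += 1
        elif observed:
            misses += 1
        elif predicted:
            false_alarms += 1
    return {
        "pod": _ratio(hits, hits + misses),
        "far": _ratio(false_alarms, hits + false_alarms),
        "csi": _ratio(hits, hits + misses + false_alarms),
    }


# Made-up rain rates in mm/h, for illustration only.
reference = [0.0, 0.0, 1.0, 2.0, 5.0, 0.0]
estimate = [0.0, 0.5, 0.8, 0.0, 4.0, 0.0]
est = estimation_scores(estimate, reference)
det = detection_scores(estimate, reference)
print(est, det)
```

Applying the same scoring functions to every retrieval and every test domain is what allows a ranking of methods to be compared across CONUS, Korea, and Austria.
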

Contributions

Development and release of SatRain, the first AI benchmark dataset specifically designed for satellite-based detection and estimation of rain, addressing a critical gap in standardized evaluation for machine learning precipitation retrieval.
Integration of a diverse set of multi-sensor satellite observations (PMW, geostationary Vis/IR) with high-quality, gauge-corrected ground-based radar and gauge reference data.
Provision of a standardized evaluation protocol and out-of-distribution testing data (from Asia and Europe) to enable robust, reproducible, and fair comparisons of machine learning algorithms, mitigating overfitting to regional characteristics.
Creation of an AI-ready dataset with data available on both a regular latitude-longitude grid and native sensor sampling, supporting various AI algorithm designs.
Establishment of a clear pathway for transferring methodological advances developed with SatRain into next-generation global precipitation datasets, thereby supporting progress in meteorological research and applications.

Funding

NASA grant 80NSSC22K0604
Copernicus Climate Change Service information

Citation

@article{Pfreundschuh2026Benchmark,
  author = {Pfreundschuh, Simon and Arulraj, Malarvizhi and Behrangi, Ali and Bogerd, Linda and Calheiros, Alan J. P. and Casella, Daniele and Dolatabadi, Neda and Guilloteau, Clément and Gong, Jie and Kummerow, Christian D. and Kirstetter, Pierre and Lee, G. and Maahn, Maximilian and Milani, Lisa and Panegrossi, Giulia and Palharini, Rayana Santos Araújo and Petković, Veljko and Ryu, Soorok and Sanò, Paolo and Tan, Jackson},
  title = {A Benchmark Dataset for Satellite-Based Estimation and Detection of Rain},
  journal = {Scientific Data},
  year = {2026},
  doi = {10.1038/s41597-026-06565-0},
  url = {https://doi.org/10.1038/s41597-026-06565-0}
}

Original Source: https://doi.org/10.1038/s41597-026-06565-0