Wang et al. (2025) Saudi Rainfall (SaRa): hourly 0.1° gridded rainfall (1979–present) for Saudi Arabia via machine learning fusion of satellite and model data
Identification
- Journal: Hydrology and Earth System Sciences
- Year: 2025
- Date: 2025-10-08
- Authors: Xuetong Wang, Raied Saad Alharbi, Oscar M. Baez‐Villanueva, Amy Green, Matthew F. McCabe, Yoshihide Wada, Albert I. J. M. van Dijk, Muhammad Adnan Abid, Hylke E. Beck
- DOI: 10.5194/hess-29-4983-2025
Research Groups
- Physical Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Department of Civil Engineering, College of Engineering, King Saud University, Riyadh, Saudi Arabia
- Hydro-Climate Extremes Lab (H-CEL), Ghent University, Ghent, Belgium
- School of Engineering, Newcastle University, Newcastle upon Tyne, UK
- Tyndall Centre for Climate Change Research, Newcastle University, Newcastle upon Tyne, UK
- Fenner School of Environment & Society, Australian National University, Canberra, ACT, Australia
- Atmospheric, Oceanic and Planetary Physics (AOPP), Department of Physics, University of Oxford, Oxford, UK
- National Centre for Atmospheric Science (NCAS), Leeds, UK
- GloH2O LLC, Princeton, NJ, USA
Short Summary
This paper introduces Saudi Rainfall (SaRa), a high-resolution, hourly, gridded precipitation product for the Arabian Peninsula developed by fusing satellite and model data with machine learning. Evaluated against independent gauge observations, SaRa outperforms 19 other state-of-the-art precipitation products in the region across various evaluation metrics.
Objective
- To develop and evaluate Saudi Rainfall (SaRa), a gridded precipitation product for the Arabian Peninsula with hourly, 0.1° resolution that covers both the historical record and near real time. SaRa uses machine learning to fuse satellite and model data, addressing the critical need for accurate precipitation data in this data-sparse and water-stressed region.
Study Configuration
- Spatial Scale: Arabian Peninsula, with a grid resolution of 0.1° (approximately 11 km at the equator).
- Temporal Scale: 1979–present, with an hourly resolution and a near-real-time latency of less than 2 hours. The training and evaluation period was 2010–2024.
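As a rough check on the grid spacing quoted above, the sketch below converts a 0.1° cell to kilometres on a spherical Earth. The function name `cell_size_km`, the mean radius of 6371 km, and the example latitude of 24°N (roughly central Saudi Arabia) are illustrative assumptions, not values taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def cell_size_km(resolution_deg, latitude_deg):
    """Return the (north-south, east-west) extent in km of one grid cell."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = resolution_deg * km_per_degree
    # Meridians converge toward the poles, so the east-west extent
    # shrinks with the cosine of latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for lat in (0.0, 24.0):  # equator and an illustrative Saudi latitude
    ns, ew = cell_size_km(0.1, lat)
    print(f"lat {lat:4.1f}: {ns:.1f} km north-south, {ew:.1f} km east-west")
```

Under these assumptions a 0.1° cell is about 11.1 km on each side at the equator and narrows to about 10.2 km in the east-west direction at 24°N.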
Methodology and Data
- Models used:
  - 18 machine learning (ML) model stacks, each comprising four submodels.
  - XGBoost models for daily, 3-hourly, and hourly precipitation disaggregation.
  - Random Forest model for 3-hourly precipitation probability distribution correction.
- Data sources:
  - Dynamic Predictors (Satellite & Reanalysis/Analysis Precipitation and Temperature Products): IMERG-E V07, IMERG-L V07, IMERG-F V07, GSMaP-NRT V8, GSMaP-MVK V8, PDIR-Now, PERSIANN-CCS-CDR, CMORPH-RAW, CMORPH-RT, PERSIANN-CCS, SM2RAIN-CCI, SM2RAIN-ASCAT, SM2RAIN-GPM, CHIRP V2, CHIRPS V2, MSWEP V2.8, ERA5 (precipitation and temperature), GDAS (precipitation and temperature), JRA-3Q.
  - Precipitation Observations (Training Target & Evaluation Reference):
    - Hourly and daily gauge observations from 14,256 global stations (for training) and 119 independent stations in Saudi Arabia (for evaluation).
    - Specific datasets include EUropean RADar CLIMatology (EURADCLIM) for Europe; Stage-IV for the conterminous US; Global Historical Climatology Network-Daily (GHCN-D); Global Summary Of the Day (GSOD); Latin American Climate Assessment & Dataset (LACA&D); Chile Climate Data Library; national datasets for Brazil, Mexico, Peru, and Iran; Global Sub-Daily Rainfall (GSDR); and Integrated Surface Database (ISD).
  - Static Predictors: Aridity Index (AI), Mean Annual Precipitation (Pmean), Effective Terrain Height (ETH), Latitude (Lat), Longitude (Lon), Absolute Latitude (AbsLat).
    - Sources for static predictors include CHELSA V2.1, Trabucco and Zomer (2018), ERA5, and Global Multi-resolution Terrain Elevation Data (GMTED) 2010.
Main Results
- SaRa (primary model_01) achieved a median Kling–Gupta efficiency (KGE) of 0.36, outperforming all 19 other evaluated precipitation products in the Arabian Peninsula (e.g., ERA5: 0.21, IMERG-L V07: -0.39, MSWEP V2.8: 0.20).
- SaRa demonstrated a low peak bias of -11.17 % and a moderate wet-day bias of +1.42 days, with a median Critical Success Index (CSI) of 0.21 for precipitation events exceeding 10 mm d⁻¹.
- Machine learning models incorporating a larger number of dynamic predictors generally showed better performance, with model_06 (using four dynamic predictors) achieving the highest median KGE of 0.43.
- The spatial generalizability of SaRa models was satisfactory, showing no clear decline in KGE with increasing distance from training stations.
- Predictor importance analysis for model_01 indicated that IMERG-L V07 and ERA5 precipitation were the most important dynamic predictors, while static predictors such as longitude, latitude, and absolute latitude accounted for regional variability.
- The mean annual precipitation for Saudi Arabia during 1991–2020, based on SaRa, is estimated at 64 mm yr⁻¹.
- The average annual maximum daily precipitation is 19 mm d⁻¹, and the average annual maximum hourly precipitation is 6.9 mm h⁻¹.
- Saudi Arabia experiences an average of 10 rainy days per year (≥0.5 mm d⁻¹) and 51 rainy hours per year (≥0.1 mm h⁻¹).
- Trend analysis for 1979–2023 using SaRa revealed declines in mean annual precipitation (-0.50 % yr⁻¹), daily precipitation frequency (-0.11 % yr⁻¹), and annual maximum daily precipitation (-0.58 % yr⁻¹). Multiplied over the 45-year span, these rates correspond to cumulative changes of -22.5 %, -5.0 %, and -26.1 %, respectively, though most trends were not statistically significant.
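The two skill scores quoted in the results above, the Kling–Gupta efficiency (KGE) and the Critical Success Index (CSI), can be sketched as follows. This is a minimal illustration rather than the paper's evaluation code: it assumes the original KGE formulation of Gupta et al. (2009), with correlation, variability ratio, and bias ratio, since the summary does not say which KGE variant was used. It also assumes that "exceeding" means strictly greater than the threshold. The function names and sample values are hypothetical.

```python
import math


def kge(sim, obs):
    """Kling-Gupta efficiency (assumed 2009 form); 1.0 is a perfect score.

    Requires non-constant series and a non-zero observed mean.
    """
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (std_s * std_o)  # linear correlation
    alpha = std_s / std_o      # variability ratio
    beta = mean_s / mean_o     # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def csi(sim, obs, threshold):
    """Critical Success Index for events strictly above `threshold`."""
    hits = misses = false_alarms = 0
    for s, o in zip(sim, obs):
        sim_event, obs_event = s > threshold, o > threshold
        if sim_event and obs_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif sim_event:
            false_alarms += 1
    total = hits + misses + false_alarms
    # With no observed or simulated events the score is undefined.
    return hits / total if total else float("nan")


# Made-up daily totals in mm/d, purely for illustration.
gauge = [0.0, 12.0, 15.0, 0.0, 20.0]
product = [0.0, 11.0, 5.0, 14.0, 25.0]

print(f"KGE = {kge(product, gauge):.2f}")
print(f"CSI (>10 mm/d) = {csi(product, gauge, 10.0):.2f}")
```

In the sample data the product records two hits, one miss, and one false alarm above 10 mm d⁻¹, so the CSI is 2 / 4 = 0.50. In an evaluation like the one summarized here, each score would be computed per gauge and then summarized by the median across gauges.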
Contributions
- Development of SaRa, the first high-resolution (hourly, 0.1°), long-term (1979–present), and near-real-time precipitation product specifically for the Arabian Peninsula, a critically data-sparse and water-stressed region.
- Successful application of advanced machine learning model stacks to optimally fuse diverse satellite and model precipitation data, resulting in superior performance compared to existing state-of-the-art products in the region.
- The most comprehensive daily evaluation of gridded precipitation products in the Arabian Peninsula to date, using a robust set of independent gauge observations.
- A robust and reliable dataset for supporting hydrological modeling, water resource assessments, flood management, and climate research in the Arabian Peninsula.
- A potential framework for developing consistent long-term precipitation datasets in other arid and dryland regions globally.
Funding
- KAUST’s Center of Excellence for Generative AI (grant no. 5940).
Citation
@article{Wang2025Saudi,
author = {Wang, Xuetong and Alharbi, Raied Saad and Baez-Villanueva, Oscar M. and Green, Amy and McCabe, Matthew F. and Wada, Yoshihide and van Dijk, Albert I. J. M. and Abid, Muhammad Adnan and Beck, Hylke E.},
title = {Saudi Rainfall (SaRa): hourly 0.1° gridded rainfall (1979–present) for Saudi Arabia via machine learning fusion of satellite and model data},
journal = {Hydrology and Earth System Sciences},
year = {2025},
doi = {10.5194/hess-29-4983-2025},
url = {https://doi.org/10.5194/hess-29-4983-2025}
}
Original Source: https://doi.org/10.5194/hess-29-4983-2025