Yue et al. (2026) Improving Daily Precipitation Estimates through Machine Learning-Based Downscaling, Precipitation Event Classification, and Categorical Merging

Identification

Journal: Water Resources Management
Year: 2026
Date: 2026-02-27
Authors: Zhenzhen Yue, Lihua Xiong, Chenguang Xiang
DOI: 10.1007/s11269-026-04492-8

Research Groups

PowerChina Huadong Engineering Corporation Limited, Hangzhou, Zhejiang, China
State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan, China
PowerChina Kunming Engineering Corporation Limited, Kunming, China

Short Summary

This study proposes a three-step machine learning framework for multi-source precipitation merging that integrates downscaling, precipitation event classification, and categorical merging. Applied to the Pearl River Basin, the framework yields a high-resolution (1 km, daily) merged precipitation dataset (MSMP) that is more accurate than existing products, especially for heavy and extreme precipitation.

Objective

The study aims to overcome the limitations of existing multi-source precipitation merging (MSP) methods, which assume fixed relationships and often degrade in performance during heavy and extreme rainfall events. To do so, it develops a three-step merging framework (downscaling, precipitation event classification, and categorical merging) that uses machine learning to capture nonlinear precipitation behavior and produce an accurate, high-resolution merged precipitation dataset.

Study Configuration

Spatial Scale: Pearl River Basin, South China; 1 km spatial resolution.
Temporal Scale: 1981–2020; daily temporal resolution.

Methodology and Data

Models used:
- Random Forest (RF) for precipitation fusion (downscaling, classification, merging).
- Ordinary Kriging (OK) for interpolation and resampling.
- Spatial Random Forest (SRF) model for downscaling with spatial autocorrelation.
- XGBoost and GBDT, included for comparative analysis.
Data sources:
- Gauge precipitation: Daily precipitation from 48 gauges provided by the China Meteorological Administration (CMA) for 1981–2020.
- Multi-source precipitation products:
  - APHRODITE (V1101): 0.25° × 0.25°, 1981–2000.
  - GSMaP_Gauge (Version 06 Final Run): 0.1° × 0.1°, 2001–2020.
  - IMERG-F (Version 06): 0.1° × 0.1°, 2001–2020.
  - CHIRPS: 0.05° × 0.05°, 1981–2020.
  - ERA5-Land reanalysis: 0.1° × 0.1°, 1981–2020.
- Environmental variables:
  - Total Column Water Vapor (TCWV) from ERA5 reanalysis (0.25° × 0.25°, 1981–2020).
  - Geospatial variables (elevation, longitude, latitude, slope, aspect) derived from Shuttle Radar Topography Mission (SRTM) DEM (90 m × 90 m).
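
The classification and categorical-merging steps listed under "Models used" can be illustrated with a minimal sketch. This is not the authors' implementation: the study uses Random Forest models for both steps, whereas the sketch below substitutes a threshold rule on the product mean for the classifier and a single least-squares scale factor per class for the regressor. The class edges, the function names (`event_class`, `fit_class_scales`, `merge`), and the toy data are all assumptions made for illustration, and the downscaling step is omitted.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import fmean

# Hypothetical daily intensity class edges in mm/day (not taken from the paper).
CLASS_EDGES_MM = (0.1, 10.0, 25.0, 50.0)


def event_class(p_mm):
    """Return the index of the intensity class that contains p_mm."""
    return bisect_right(CLASS_EDGES_MM, p_mm)


def fit_class_scales(product_rows, gauge):
    """Fit one least-squares scale factor per event class.

    product_rows: list of per-day lists, one value per precipitation product.
    gauge: list of gauge observations for the same days.
    Stand-in for the per-class Random Forest regressors used in the study.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for row, g in zip(product_rows, gauge):
        m = fmean(row)               # simple average of the products
        c = event_class(m)           # class is decided from the products
        num[c] += g * m
        den[c] += m * m
    # Minimises sum((g - k * m)^2) within each class: k = sum(g*m) / sum(m*m).
    return {c: num[c] / den[c] for c in den if den[c] > 0}


def merge(row, scales):
    """Merge one day's product values using the scale factor of its class."""
    m = fmean(row)
    return scales.get(event_class(m), 1.0) * m


# Toy training data: products match gauges on light days but
# underestimate heavy days by half.
train_rows = [[0.0, 0.0, 0.0], [5.0, 5.0, 5.0], [2.0, 2.0, 2.0],
              [30.0, 30.0, 30.0], [40.0, 40.0, 40.0]]
train_gauge = [0.0, 5.0, 2.0, 60.0, 80.0]

scales = fit_class_scales(train_rows, train_gauge)
merged_heavy = merge([28.0, 32.0, 30.0], scales)
merged_light = merge([4.0, 6.0, 5.0], scales)
print(scales, merged_heavy, merged_light)
```

Fitting a separate model per class is what lets the correction differ between light and heavy days; a single global factor would have to compromise between the two regimes.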

Main Results

The Multi-Source Merging Precipitation dataset (MSMP) consistently outperformed original precipitation products and conventional MSP methods in both statistical and categorical metrics.
MSMP achieved a correlation coefficient (CC) of 0.88, representing a 10%–60% improvement over original products.
MSMP showed an RMSE of 5.21 mm, corresponding to a 25%–59% reduction, and a Kling-Gupta Efficiency (KGE) of 0.88, reflecting a 20%–60% improvement.
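
For readers unfamiliar with the continuous metrics quoted above, the following sketch shows how CC, RMSE, and KGE are commonly computed from paired gauge and estimate series. It assumes the original 2009 formulation of KGE (correlation, variability ratio, bias ratio); the summary does not state which KGE variant the study uses, and the sample values are invented for illustration.

```python
import math
from statistics import fmean


def pearson_cc(obs, sim):
    """Pearson correlation coefficient between two equal-length series."""
    mo, ms = fmean(obs), fmean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def rmse(obs, sim):
    """Root mean square error, in the units of the inputs (mm here)."""
    return math.sqrt(fmean([(s - o) ** 2 for o, s in zip(obs, sim)]))


def kge(obs, sim):
    """Kling-Gupta Efficiency, 2009 form: 1 - sqrt((r-1)^2 + (a-1)^2 + (b-1)^2)."""
    r = pearson_cc(obs, sim)
    mo, ms = fmean(obs), fmean(sim)
    sd_o = math.sqrt(fmean([(o - mo) ** 2 for o in obs]))
    sd_s = math.sqrt(fmean([(s - ms) ** 2 for s in sim]))
    alpha = sd_s / sd_o   # variability ratio
    beta = ms / mo        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Invented daily values in mm, for illustration only.
gauge_mm = [0.0, 2.0, 10.0, 30.0, 58.0]
estimate_mm = [0.5, 1.5, 12.0, 27.0, 55.0]

print(pearson_cc(gauge_mm, estimate_mm))
print(rmse(gauge_mm, estimate_mm))
print(kge(gauge_mm, estimate_mm))
```

A perfect estimate gives CC = 1, RMSE = 0, and KGE = 1, so higher CC and KGE and lower RMSE indicate better agreement with the gauges.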
MSMP demonstrated superior performance for heavy and extreme precipitation events, achieving the lowest False Alarm Rate (FAR) and highest Critical Success Index (CSI).
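
The categorical scores mentioned above are derived from a contingency table of hits, misses, and false alarms at a chosen intensity threshold. The sketch below assumes the definitions common in precipitation verification, with FAR computed as false alarms divided by all estimated events. The 25 mm/day threshold and the sample series are hypothetical; the summary does not state the thresholds used in the study.

```python
def contingency(obs, sim, threshold):
    """Count hits, misses, and false alarms for events >= threshold."""
    hits = misses = false_alarms = 0
    for o, s in zip(obs, sim):
        obs_event = o >= threshold
        sim_event = s >= threshold
        if obs_event and sim_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif sim_event:
            false_alarms += 1
    return hits, misses, false_alarms


def pod(hits, misses, false_alarms):
    """Probability of detection: share of observed events that were caught."""
    total = hits + misses
    return hits / total if total else float("nan")


def far(hits, misses, false_alarms):
    """False alarm ratio: share of estimated events that did not occur."""
    total = hits + false_alarms
    return false_alarms / total if total else float("nan")


def csi(hits, misses, false_alarms):
    """Critical success index: hits over all observed or estimated events."""
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")


# Invented daily values in mm; 25 mm/day is a hypothetical heavy-rain threshold.
gauge_mm = [0.0, 30.0, 60.0, 5.0, 40.0, 0.0, 26.0]
estimate_mm = [0.0, 28.0, 55.0, 27.0, 10.0, 0.0, 30.0]

table = contingency(gauge_mm, estimate_mm, threshold=25.0)
print(table)         # (3, 1, 1)
print(pod(*table))   # 0.75
print(far(*table))   # 0.25
print(csi(*table))   # 0.6
```

A lower FAR means fewer spurious heavy-rain days, and a higher CSI means better overall event detection, which is why the two are reported together.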
The multi-class classification strategy consistently outperformed binary classification across all tested machine learning models, with the Random Forest-based multi-class scheme showing the best overall performance.
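
The difference between the two labelling strategies can be made concrete with a short sketch. A binary scheme only separates wet from dry days, while a multi-class scheme also distinguishes intensity levels, so a downstream merging model can treat heavy days differently from light ones. The wet-day threshold, class edges, and class names below are hypothetical; the summary does not list the classes used in the study.

```python
from bisect import bisect_right

# Hypothetical thresholds in mm/day, chosen only for illustration.
WET_DAY_MM = 0.1
CLASS_EDGES_MM = (0.1, 10.0, 25.0, 50.0)
CLASS_NAMES = ("none", "light", "moderate", "heavy", "extreme")


def binary_label(p_mm):
    """Binary scheme: 1 for a wet day, 0 for a dry day."""
    return int(p_mm >= WET_DAY_MM)


def multiclass_label(p_mm):
    """Multi-class scheme: index into CLASS_NAMES by daily intensity."""
    return bisect_right(CLASS_EDGES_MM, p_mm)


# Invented daily totals in mm.
daily_mm = [0.0, 3.0, 12.0, 25.0, 80.0]
binary = [binary_label(p) for p in daily_mm]
multi = [CLASS_NAMES[multiclass_label(p)] for p in daily_mm]
print(binary)  # [0, 1, 1, 1, 1]
print(multi)   # ['none', 'light', 'moderate', 'heavy', 'extreme']
```

In the binary scheme the 3 mm and 80 mm days receive the same label, which illustrates the information a multi-class scheme adds.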
Variable importance analysis confirmed that gauge precipitation data and precipitation classification variables were the most influential inputs in both classification and regression models.
MSMP exhibited improved spatial accuracy and more stable and reliable detection performance across all seasons and subregional scales within the Pearl River Basin.

Contributions

Proposes a novel three-step machine learning-based framework (downscaling, precipitation event classification, categorical merging) that explicitly accounts for nonlinear relationships between precipitation intensity and environmental variables.
Develops a high-resolution (1 km, daily) merged precipitation dataset (MSMP) for the Pearl River Basin (1981–2020) with significantly enhanced accuracy, particularly for heavy and extreme precipitation events.
Demonstrates the superior performance of multi-class precipitation event classification over binary classification in improving fusion accuracy and reliability.
Provides more reliable precipitation inputs for hydrological modeling, water allocation planning, reservoir operation, and flood risk management, especially in extreme hydrological scenarios.

Funding

Postdoctoral project of POWERCHINA Huadong Engineering Corporation Limited (KY2024-NGH-02–06)
National Natural Science Foundation of China (NSFC Grant U2240201)
Yunnan International Joint R&D Center for Basin-scale Water-Energy-Ecology Regulation (Grant No. 202503AP140045)
Science and Technology Project of Power China Kunming Engineering Corporation Limited (KD-ZDYF2024-085)

Citation

@article{Yue2026Improving,
  author = {Yue, Zhenzhen and Xiong, Lihua and Xiang, Chenguang},
  title = {Improving Daily Precipitation Estimates through Machine Learning-Based Downscaling, Precipitation Event Classification, and Categorical Merging},
  journal = {Water Resources Management},
  year = {2026},
  doi = {10.1007/s11269-026-04492-8},
  url = {https://doi.org/10.1007/s11269-026-04492-8}
}

Original Source: https://doi.org/10.1007/s11269-026-04492-8