Hu et al. (2026) Advancing hydrological prediction in South Africa with differentiable multi-source meteorological data fusion
Identification
- Journal: Journal of Hydrology: Regional Studies
- Year: 2026
- Date: 2026-02-10
- Authors: Yuqian Hu, Chunxiao Zhang, Heng Peng Li, Rongrong Li, W. P. Chu, Hanguang Yu
- DOI: 10.1016/j.ejrh.2026.103238
Research Groups
- School of Artificial Intelligence, China University of Geosciences (Beijing), Beijing, China
- Chinese Academy of Surveying and Mapping, Beijing, China
- Hebei Key Laboratory of Geospatial Digital Twin and Collaborative Optimization, China University of Geosciences (Beijing), Beijing, China
- Key Laboratory of Virtual Geographic Environment (Ministry of Education of PRC), Nanjing Normal University, Nanjing, Jiangsu, China
- Institute of Space and Earth Information Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China
- Department of Geography, McGill University, Montreal, Quebec, Canada
Short Summary
This study developed a differentiable multi-source meteorological data fusion framework for regional runoff prediction in 188 South African basins. By adaptively weighting precipitation sources without relying on ground observations, it significantly outperformed non-fusion baselines. The framework achieved a median Nash–Sutcliffe efficiency of 0.38, a greater than 52 % improvement over the best single-source model and a 23 % increase over direct splicing.
Objective
- To develop and evaluate an end-to-end differentiable fusion framework for runoff prediction in South African catchments, demonstrating its advantages over traditional single-source forcing and simple multi-source splicing strategies.
- To investigate the spatiotemporal heterogeneity of dynamic weights assigned to different precipitation products by the fusion module and interpret how these preferences reflect underlying geographic and climatic characteristics of individual basins.
- To assess the extent to which performance gains from the fusion strategy are more pronounced in hydrologically vulnerable arid basins, illustrating the framework’s strengths in reproducing critical runoff processes under challenging conditions.
Study Configuration
- Spatial Scale: 188 small and medium-sized river basins across South Africa, with areas ranging from 10 to 2000 square kilometers, covering diverse climatic and physiographic settings.
- Temporal Scale: Daily data from 1985 to 2010. Training period: 1 October 1985 to 30 September 2005. Testing period: 1 October 2005 to 30 September 2010.
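The temporal split above can be written down as a small sketch. The dates come from the summary; the function and variable names are illustrative assumptions, not taken from the paper.

```python
from datetime import date

# Period boundaries as stated in the summary (hydrological years starting 1 October).
TRAIN_START, TRAIN_END = date(1985, 10, 1), date(2005, 9, 30)
TEST_START, TEST_END = date(2005, 10, 1), date(2010, 9, 30)


def assign_period(day: date) -> str:
    """Return 'train', 'test', or 'unused' for a daily time step."""
    if TRAIN_START <= day <= TRAIN_END:
        return "train"
    if TEST_START <= day <= TEST_END:
        return "test"
    return "unused"


# Number of daily steps in each period (both endpoints inclusive).
n_train_days = (TRAIN_END - TRAIN_START).days + 1
n_test_days = (TEST_END - TEST_START).days + 1
```

The two periods are contiguous and non-overlapping, covering 20 and 5 hydrological years respectively.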
Methodology and Data
- Models used: An end-to-end differentiable precipitation-fusion Long Short-Term Memory (dPL) model, which implements the differentiable multi-source meteorological data fusion framework. It comprises a lightweight Feed-Forward Neural Network (FFNN) as the fusion module and a Long Short-Term Memory (LSTM) network for hydrological modeling.
- Data sources:
  - Runoff data: Daily runoff series from the Global Runoff Data Center (GRDC).
  - Precipitation data: CHIRPS (Climate Hazards Group InfraRed Precipitation with Station Data, ~0.05° spatial resolution), ERA5-Land (high-resolution reanalysis), and TAMSAT (satellite-based precipitation product optimized for Africa), all at daily scale.
  - Other meteorological data: From ERA5-Land (0.1° spatial resolution, daily): 2-meter air temperature, potential evaporation, surface net solar radiation, and surface pressure.
  - Basin static attribute data: Basin area (square kilometers), mean daily precipitation (meters), aridity (ratio of mean potential evapotranspiration to mean precipitation), moisture index, seasonality, frequency of high-precipitation days (days with ≥5 × mean daily precipitation), groundwater table depth (centimeters), mean subgrid slope (meters per meter), forest cover extent (percent), silt fraction in soil (percent), clay fraction in soil (percent), soil erosion (kilograms per hectare per year), sand fraction in soil (percent), and surface elevation (meters).
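To make the fusion idea concrete, the following is a minimal sketch of how a small network could turn input features into per-source weights and combine the three precipitation products into one fused value. The summary does not specify the FFNN architecture or how weights are normalized, so the single linear layer, the softmax normalization, and every name and number below are assumptions for illustration only.

```python
import math
from typing import List, Sequence

SOURCES = ("CHIRPS", "ERA5-Land", "TAMSAT")


def softmax(scores: Sequence[float]) -> List[float]:
    """Map raw scores to non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fusion_weights(features: Sequence[float],
                   weight_matrix: Sequence[Sequence[float]],
                   bias: Sequence[float]) -> List[float]:
    """One linear layer plus softmax: a stand-in for the FFNN (assumption)."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weight_matrix, bias)
    ]
    return softmax(scores)


def fuse_precipitation(precip: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the per-source precipitation values for one day."""
    return sum(w * p for w, p in zip(weights, precip))


# Hypothetical inputs for one basin-day: three precipitation estimates (mm/day)
# followed by two made-up static attributes (not values from the paper).
precip_today = [4.0, 6.0, 5.0]
features = precip_today + [0.8, 1.2]

# Hypothetical parameters; in an end-to-end setup these would be learned
# from the runoff loss rather than set by hand.
weight_matrix = [[0.0] * len(features) for _ in SOURCES]
bias = [0.0, 0.0, 0.0]

weights = fusion_weights(features, weight_matrix, bias)
fused = fuse_precipitation(precip_today, weights)
```

With all-zero parameters the three sources receive equal weight, so the fused value is the plain average. In an end-to-end differentiable setup, the fused series would be passed to the LSTM together with the other forcings and static attributes, and the gradient of the runoff loss would adjust the fusion parameters, which is why no ground precipitation observations are needed to calibrate the weights.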
Main Results
- The differentiable fusion framework consistently outperformed non-fusion baselines, achieving a median Nash–Sutcliffe efficiency (NSE) of 0.38, which is a >52 % improvement over the best single-source input model and a 23 % increase over the direct splicing method.
- The framework effectively mitigated data inconsistency and redundancy, significantly reducing physically unreasonable negative flow predictions (e.g., from >8 % to <1 % in representative arid basins).
- In arid regions, the fusion mechanism automatically captured the complementary strengths of different precipitation products, leading to greater performance improvements (ΔNSE was negatively correlated with the moisture index, so drier basins gained more).
- Spatially, ERA5-Land was weighted most heavily overall, but CHIRPS and TAMSAT played important roles under specific topographic or climatic conditions.
- Fusion weights showed significant correlations with basin attributes: CHIRPS weights positively correlated with mean elevation (r = 0.53) and negatively with soil sand content (r = -0.41); ERA5-Land weights positively correlated with mean slope (r = 0.46) and negatively with mean elevation (r = -0.53); TAMSAT weights positively correlated with high precipitation frequency (r = 0.20) and negatively with slope (r = -0.32).
- In a 10-fold spatial cross-validation for ungauged basins, the model achieved a median NSE of 0.292, demonstrating robust generalization capability.
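The metrics quoted in these results can be sketched as follows. The NSE formula is the standard definition. The helper `implied_baseline` assumes the reported percentage gains are relative improvements in median NSE, that is, (fused − baseline) / baseline; the summary does not state this, so the back-calculated baselines are rough implied values, not figures reported by the paper.

```python
from typing import Sequence


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash–Sutcliffe efficiency: 1 - SSE / spread of observations about their mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sse / spread


def negative_fraction(simulated: Sequence[float]) -> float:
    """Share of time steps with a physically unreasonable negative flow."""
    return sum(1 for s in simulated if s < 0) / len(simulated)


def implied_baseline(value: float, relative_gain: float) -> float:
    """Baseline score implied by `value` being `relative_gain` above it (assumption)."""
    return value / (1.0 + relative_gain)


# Back-of-the-envelope check on the reported median NSE of 0.38.
best_single_source_upper = implied_baseline(0.38, 0.52)  # gain is ">52 %", so this is an upper bound
direct_splicing_approx = implied_baseline(0.38, 0.23)
```

Under that assumption, the best single-source model would have a median NSE of at most about 0.25 and direct splicing roughly 0.31. An NSE of 1 indicates a perfect fit, and an NSE of 0 means the model is no better than predicting the observed mean.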
Contributions
- Proposed a novel task-driven differentiable fusion framework that jointly optimizes data fusion and runoff prediction in an end-to-end manner, eliminating the need for ground observations to calibrate fusion weights.
- Demonstrated significant performance improvements in runoff prediction in data-scarce regions (South Africa) compared to traditional single-source and direct splicing methods.
- Revealed the spatially differentiated value of different precipitation data sources and how fusion weights are controlled by basin characteristics, enhancing interpretability.
- Improved physical realism by significantly reducing unreasonable negative flow predictions, especially in drought-prone basins.
- Showcased robust generalization capability in ungauged basins through spatial cross-validation.
- Maintained parameter efficiency, achieving substantial performance gains without increasing model complexity compared to direct splicing.
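The spatial cross-validation used to test generalization to ungauged basins can be illustrated with a short sketch. It assumes basins are assigned to the 10 folds at random; the summary does not say how the folds were formed, and the basin identifiers below are hypothetical.

```python
import random
from typing import List, Sequence


def spatial_folds(basin_ids: Sequence[str], n_folds: int = 10, seed: int = 0) -> List[List[str]]:
    """Shuffle basins and deal them into n_folds disjoint groups."""
    shuffled = list(basin_ids)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


# Hypothetical identifiers for the 188 basins.
basins = [f"basin_{i:03d}" for i in range(188)]
folds = spatial_folds(basins)

# Each fold is held out once; the model is trained on the remaining basins
# and evaluated on the held-out ones as if they were ungauged.
splits = []
for held_out in folds:
    held = set(held_out)
    train_basins = [b for b in basins if b not in held]
    splits.append((train_basins, held_out))
```

Because every basin appears in exactly one held-out fold, each basin's test score comes from a model that never saw its runoff record during training.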
Funding
- National Natural Science Foundation of China (grant number 42371425)
Citation
@article{Hu2026Advancing,
author = {Hu, Yuqian and Zhang, Chunxiao and Li, Heng Peng and Li, Rongrong and Chu, W. P. and Yu, Hanguang},
title = {Advancing hydrological prediction in South Africa with differentiable multi-source meteorological data fusion},
journal = {Journal of Hydrology: Regional Studies},
year = {2026},
doi = {10.1016/j.ejrh.2026.103238},
url = {https://doi.org/10.1016/j.ejrh.2026.103238}
}
Original Source: https://doi.org/10.1016/j.ejrh.2026.103238