Weng et al. (2026) Scenario-driven probabilistic streamflow forecast based on the conditional vine copula and denoising diffusion probabilistic model
Identification
- Journal: Journal of Hydrology: Regional Studies
- Year: 2026
- Date: 2026-01-06
- Authors: Peiyao Weng, Yü Tian, Yunzhong Jiang, Yu Qiao, Lingzhong Kong
- DOI: 10.1016/j.ejrh.2025.103069
Research Groups
- School of Civil Engineering, Tianjin University, Tianjin 300072, China
- China Institute of Water Resources and Hydropower Research, Beijing 100038, China
- China South-to-North Water Diversion Corporation Limited, Beijing 100070, China
- College of Hydraulic Science and Engineering, Yangzhou University, Yangzhou 225009, China
Short Summary
This study proposes a novel scenario-driven probabilistic streamflow forecasting framework that integrates a conditional vine copula (CVC) model with a UNet-based Conditional Enhanced Denoising Diffusion Probabilistic Model (U-CEDDPM). The framework demonstrates superior deterministic and probabilistic performance for long-term streamflow prediction in Northern China's Hongze and Luoma Lakes, significantly outperforming benchmark generative models.
Objective
- To develop and validate a novel scenario-driven probabilistic streamflow forecasting framework by integrating a conditional vine copula (CVC) model with a UNet-based Conditional Enhanced Denoising Diffusion Probabilistic Model (U-CEDDPM).
- To assess the framework's ability to capture streamflow-teleconnection dependencies for dry/wet classification and generate high-fidelity probabilistic streamflow scenarios for long-term prediction (1-12 months ahead).
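As background for the diffusion component named in these objectives, the sketch below shows the forward (noising) process of a generic denoising diffusion probabilistic model. It is not the authors' U-CEDDPM: the linear noise schedule, the 1000 steps, the 12-value input vector, and the function names `linear_beta_schedule` and `q_sample` are all assumptions chosen for illustration.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    return np.linspace(beta_start, beta_end, num_steps)


def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t from q(x_t | x_0) in closed form.

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)
    """
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps


num_steps = 1000                      # assumed number of diffusion steps
betas = linear_beta_schedule(num_steps)
alpha_bar = np.cumprod(1.0 - betas)   # cumulative signal retention per step

rng = np.random.default_rng(0)
x0 = rng.standard_normal(12)          # stand-in for 12 standardized monthly flows
xt, eps = q_sample(x0, num_steps - 1, alpha_bar, rng)
```

A denoising network is then trained to predict `eps` from `xt`, the step index, and any conditioning inputs. In the paper that role is played by the UNet-based network with its attention and categorical-feature modules, which this sketch does not attempt to reproduce.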
Study Configuration
- Spatial Scale: The study focused on two lakes in Northern China: Hongze (HZ) Lake, with a drainage area of 2596 km² in the lower reaches of the Huai River, and Luoma (LM) Lake, with a drainage area of 375 km² in the Yi-Shu-Si River basin. Both are critical water sources for the Eastern Route of China’s South-to-North Water Diversion Project.
- Temporal Scale: Monthly streamflow data from 1972 to 2019 (48 years) were used. The dataset was partitioned into a calibration period (1972–2009) and a validation period (2010–2019). Forecasts were generated for horizons ranging from 1 to 12 months ahead, operating on a rolling annual cycle from October to September.
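The temporal setup above can be sketched as a few helper functions. This is a minimal illustration, not the authors' code. The function names are hypothetical, labeling a water year by the calendar year in which it ends is an assumption, and so is splitting the two periods by calendar year (the summary does not say whether the boundary follows calendar or water years).

```python
from datetime import date


def water_year(d: date) -> int:
    """Water year (October to September) that a date belongs to.

    Assumption: the water year is labeled by the calendar year in which it
    ends, so October 2009 falls in water year 2010.
    """
    return d.year + 1 if d.month >= 10 else d.year


def lead_month(d: date) -> int:
    """Forecast horizon (1-12) of a month within its water year: Oct=1, Sep=12."""
    return (d.month - 10) % 12 + 1


def split_period(d: date) -> str:
    """Assign a month to the calibration or validation period by calendar year."""
    if 1972 <= d.year <= 2009:
        return "calibration"
    if 2010 <= d.year <= 2019:
        return "validation"
    return "outside"
```

For example, `lead_month(date(2015, 1, 1))` gives 4, since January is the fourth month of an October-to-September cycle.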
Methodology and Data
- Models used:
  - Proposed Framework: Integrated Conditional Vine Copula (CVC) model with UNet-based Conditional Enhanced Denoising Diffusion Probabilistic Model (U-CEDDPM).
  - CVC: Utilized C-vine and D-vine copula structures, fitted with Gamma, Gaussian, Lognormal, Generalized Extreme Value, and Weibull marginal distributions, and Gaussian, Student t, Clayton, Gumbel, Frank, and Joe bivariate copulas. Monte Carlo simulations were used for probabilistic realizations.
  - U-CEDDPM: Employed a UNet-based denoising network incorporating an Enhanced Conditional Attention Module, an Attention-Enhanced UNet, and a Categorical Feature Integration Module.
  - Benchmark Generative Models: Conditional Variational Autoencoder (CVAE), Conditional Generative Adversarial Network (CGAN), and Conditional Denoising Diffusion Probabilistic Model (CDDPM).
- Data sources:
  - Monthly streamflow data for HZ Lake and LM Lake (1972–2019) obtained from basin authority monitoring stations.
  - Candidate predictors included antecedent monthly flows (Q), total flow of previous water-year (AQ), total flow during non-flood seasons of previous water-year (NFQ), and total flow during flood seasons of previous water-year (FQ).
  - Teleconnection factors comprised 88 atmospheric circulation indices (AC) and 26 sea surface temperature indices (SST) from the National Climate Center of China Meteorology Administration.
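The marginal-fitting step of the CVC can be illustrated with a short sketch that fits the five candidate distributions listed above and keeps the best one. Selecting by AIC is an assumption, since the summary does not state the criterion used. The flow series is synthetic, the helper `select_marginal` is hypothetical, and the vine copula construction itself is not shown.

```python
import numpy as np
from scipy import stats

# The five candidate marginal families named in the summary.
CANDIDATES = {
    "gamma": stats.gamma,
    "gaussian": stats.norm,
    "lognormal": stats.lognorm,
    "gev": stats.genextreme,
    "weibull": stats.weibull_min,
}


def select_marginal(sample):
    """Fit each candidate by maximum likelihood and return the lowest-AIC one.

    Returns (name, frozen_distribution, aic). AIC as the selection criterion
    is an assumption made for this sketch.
    """
    best = None
    for name, dist in CANDIDATES.items():
        params = dist.fit(sample)
        loglik = np.sum(dist.logpdf(sample, *params))
        aic = 2 * len(params) - 2 * loglik
        # Skip fits whose support excludes part of the sample (infinite AIC).
        if np.isfinite(aic) and (best is None or aic < best[2]):
            best = (name, dist(*params), aic)
    return best


rng = np.random.default_rng(0)
flows = rng.gamma(shape=2.0, scale=300.0, size=480)  # synthetic stand-in data

name, fitted, aic = select_marginal(flows)
u = fitted.cdf(flows)  # probability integral transform to (0, 1) for copula fitting
```

The transformed values `u` are what a bivariate or vine copula would be fitted to. Monte Carlo realizations would then be drawn from the copula and mapped back to flow units with `fitted.ppf`.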
Main Results
- CVC Performance: The teleconnection-enhanced CVC significantly improved water-year dry/wet classification for HZ Lake, achieving 20.19 % higher precision and 15.39 % higher recall than the teleconnection-excluded configuration, particularly for extreme dry, wet, and normal events. For LM Lake, teleconnection integration yielded only marginal gains.
- U-CEDDPM Deterministic Performance:
  - HZ Lake: Achieved a Nash-Sutcliffe Efficiency (NSE) of 0.72, Root Mean Square Error (RMSE) of 654.22 m³/s, and Mean Absolute Error (MAE) of 365.11 m³/s. This represented a 157.14 % higher NSE, 37.35 % lower RMSE, and 49.65 % lower MAE compared to CVAE.
  - LM Lake: Achieved an NSE of 0.67, RMSE of 157.49 m³/s, and MAE of 77.06 m³/s. This represented a 204.55 % higher NSE, 34.65 % lower RMSE, and 46.65 % lower MAE compared to CVAE.
- U-CEDDPM Probabilistic Performance:
  - HZ Lake: Achieved a Prediction Interval Coverage Probability at the 95 % confidence level (PICP95) of 66.28 % and a Coverage Width-based Criterion at the 80 % level (CWC80) of 0.92, a 7–17 times better balance of reliability and sharpness than the benchmark models.
  - LM Lake: Achieved a PICP95 of 68.75 % and a CWC80 of 0.39, an approximately 13–16 times improvement over the benchmark models.
  - U-CEDDPM consistently showed superior reliability from October to June but degraded in July and August (e.g., PICP80 for HZ Lake decreased to 20.22 % in July).
- Scenario Generation: U-CEDDPM effectively captured short- and long-term temporal dependencies, with its probability density function (PDF) estimates showing improved alignment with observed streamflow, especially during flood seasons. A subtle overestimation of cold-season streamflow (December-February) was noted.
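The deterministic and coverage metrics reported above follow standard definitions, sketched below for reference. This is a minimal illustration, not the authors' evaluation code. Deriving the interval from empirical quantiles of the generated scenarios is an assumption, and CWC is omitted because the summary does not give the formulation or penalty parameters used in the paper.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def rmse(obs, sim):
    """Root Mean Square Error, in the units of the data (e.g. m³/s)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


def mae(obs, sim):
    """Mean Absolute Error, in the units of the data."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.mean(np.abs(obs - sim)))


def interval_from_scenarios(scenarios, level=0.95):
    """Central prediction interval from an (n_scenarios, n_times) array.

    Assumption: bounds are the empirical (1-level)/2 and (1+level)/2 quantiles.
    """
    alpha = 1.0 - level
    lower = np.quantile(scenarios, alpha / 2.0, axis=0)
    upper = np.quantile(scenarios, 1.0 - alpha / 2.0, axis=0)
    return lower, upper


def picp(obs, lower, upper):
    """Percentage of observations that fall inside [lower, upper]."""
    obs = np.asarray(obs, float)
    inside = (obs >= np.asarray(lower, float)) & (obs <= np.asarray(upper, float))
    return 100.0 * float(inside.mean())
```

Under these definitions, a PICP95 of 66.28 % means that about two thirds of the observations fell inside the nominal 95 % interval.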
Contributions
- Proposed a novel integrated scenario-driven probabilistic streamflow forecasting framework combining the strengths of conditional vine copulas and an enhanced denoising diffusion probabilistic model (U-CEDDPM).
- Quantitatively demonstrated the significant value of incorporating teleconnection factors for improving long-term (water-year) dry/wet classification, particularly for larger river basins.
- Introduced U-CEDDPM, which significantly outperforms existing generative deep learning models (CVAE, CGAN, CDDPM) in both deterministic accuracy and probabilistic uncertainty characterization for streamflow forecasting.
- Provided a robust and reliable framework for generating high-fidelity probabilistic streamflow scenarios, offering critical support for adaptive water resource management.
Funding
- Zhongyuan Thousand Talents Plan (Project No. 254000510037)
- Key Research and Development Project of Henan Province (Grant No. 251111210700)
Citation
@article{Weng2026Scenariodriven,
author = {Weng, Peiyao and Tian, Yü and Jiang, Yunzhong and Qiao, Yu and Kong, Lingzhong},
title = {Scenario-driven probabilistic streamflow forecast based on the conditional vine copula and denoising diffusion probabilistic model},
journal = {Journal of Hydrology: Regional Studies},
year = {2026},
doi = {10.1016/j.ejrh.2025.103069},
url = {https://doi.org/10.1016/j.ejrh.2025.103069}
}
Original Source: https://doi.org/10.1016/j.ejrh.2025.103069