Poudel et al. (2025) Uncertainty in estimating the relative change of design floods under climate change: a stylized experiment with process-based, deep learning, and hybrid models

Identification

Journal: Journal of Hydrology
Year: 2025
Date: 2025-10-17
Authors: Sandeep Poudel, Nasser Najibi, Scott Steinschneider
DOI: 10.1016/j.jhydrol.2025.134427

Research Groups

Department of Civil and Environmental Engineering, Cornell University, United States
Department of Agricultural and Biological Engineering, University of Florida, United States
Department of Biological and Environmental Engineering, Cornell University, United States

Short Summary

This study conducts a stylized model-as-truth experiment across 30 Massachusetts basins to evaluate uncertainty in estimating relative changes of design floods under climate change using process-based, deep learning, and hybrid hydrological models. Findings reveal that structural limitations and equifinality dominate the uncertainty in change estimates, and that regional pooling significantly reduces the variance of those estimates.

Objective

To evaluate how errors in the estimates of change of design floods vary across process-based, deep learning-based, and hybrid hydrological models under different climate change scenarios.
To assess how uncertainty in historical precipitation impacts these errors.
To determine the sensitivity of these conclusions to calibration strategy (length of training data, calibration-stopping criteria, multi-objective optimization) and other modeling choices (e.g., design flood estimation method; regional pooling of estimates).

Study Configuration

Spatial Scale: 30 drainage basins in Massachusetts, United States, ranging from 25 square kilometers to 1785 square kilometers, with elevations between 18 meters and 507 meters above mean sea level.
Temporal Scale:
- Observed daily discharge: 1951–2015.
- Livneh climate data: 1951–2015 (for truth model calibration/validation).
- GHCN precipitation data: at least 21 years between 2000 and 2020.
- ERA5 Reanalysis temperature data.
- Synthetic climate series: 1000-year baseline and 1000-year future climate scenarios.
- Model training period: first 25 years of the 1000-year synthetic record (also 10 years for sensitivity analysis).
- Model testing period: next 15 years of data.

Methodology and Data

Models used:
- Truth Model: Conceptual Hydrologiska Byråns Vattenbalansavdelning (HBV) model (HBVTrue).
- Competing Process-based Models: Recalibrated HBV model (HBVRecalib), HYMODFull (16 parameters), HYMODSimple (9 parameters).
- Deep Learning Model: Regional Long Short-Term Memory (LSTM) network.
- Hybrid Models: HYMODSimple with LSTM post-processor (HYMODSimple,pp), HYMODFull with LSTM post-processor (HYMODFull,pp).
- Stochastic Weather Generator: Semiparametric, multivariate, multisite statistical model (Steinschneider et al., 2019).
- Extreme Value Distribution: Generalized Extreme Value (GEV) distribution (also Gumbel for sensitivity analysis).
- Fitting Methods: Maximum Likelihood Estimation (MLE) (also L-Moments for sensitivity analysis).
- Calibration Algorithms: Genetic algorithm (for process-based models), ADAM optimizer (for LSTM/hybrid models).
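
The design-flood step in the list above can be illustrated with a minimal sketch. It fits a Gumbel distribution by L-moments (the sensitivity-analysis alternative named above, chosen here only because it has a closed form; the study's main configuration is a GEV fitted by MLE) to a series of annual maximum flows, then computes the relative change in a T-year flood between a baseline and a future series. The function names, the 100-year return period, and the flow values are assumptions for illustration and are not taken from the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_lmom(annual_maxima):
    """Fit Gumbel location and scale from the first two sample L-moments."""
    x = sorted(annual_maxima)
    n = len(x)
    b0 = sum(x) / n
    # Probability-weighted moment b1: weights (i-1)/(n-1) on the ascending sample.
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    l1, l2 = b0, 2.0 * b1 - b0
    scale = l2 / math.log(2.0)
    loc = l1 - EULER_GAMMA * scale
    return loc, scale


def design_flood(annual_maxima, return_period=100.0):
    """Flow with annual exceedance probability 1 / return_period."""
    loc, scale = fit_gumbel_lmom(annual_maxima)
    p = 1.0 - 1.0 / return_period
    return loc - scale * math.log(-math.log(p))


def relative_change(baseline_maxima, future_maxima, return_period=100.0):
    """Fractional change of the design flood from baseline to future."""
    q_base = design_flood(baseline_maxima, return_period)
    q_future = design_flood(future_maxima, return_period)
    return (q_future - q_base) / q_base


# Hypothetical annual maxima (m^3/s); the future series is the baseline scaled by 1.2.
baseline = [120.0, 95.0, 210.0, 160.0, 130.0, 180.0, 90.0, 250.0, 140.0, 170.0]
future = [1.2 * q for q in baseline]
print(relative_change(baseline, future))
```

Because the L-moment estimates of location and scale are linear in the data, scaling every flow by 1.2 scales the fitted quantile by the same factor, so the printed relative change is approximately 0.2. With only a few decades of data, as in the 25-year training period, the fitted quantile itself is subject to considerable sampling error.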
Data sources:
- Streamflow: Daily discharge from the United States Geological Survey (USGS) national water information system.
- Climate Data (for truth model): Daily precipitation and temperature from the Livneh dataset.
- Climate Data (for competing models): Gauge-based daily precipitation from the Global Historical Climatology Network (GHCN), temperature from the ERA5 Reanalysis product.
- Geophysical Attributes: Basin-specific attributes from the USGS GAGES-II database.

Main Results

All models exhibited substantial inter-basin variability in relative change errors for design floods across climate scenarios, with models having structural error also showing bias. Even the perfect-structure HBVRecalib model showed errors exceeding 50% for some basins when historical precipitation errors were greater than 1 millimeter per day.
Process-based models with greater structural uncertainty (HYMODSimple) produced extreme errors, surpassing 100% for some basins.
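
To make the bias and variance language above concrete, the following minimal sketch summarizes relative-change errors across basins. It assumes that the error for a basin is the model's estimated relative change minus the truth model's relative change, in percentage points. That definition, the function names, and all numbers are illustrative assumptions, not the paper's exact formulation or results.

```python
import statistics


def change_errors(estimated_pct, true_pct):
    """Per-basin error: estimated minus true relative change (percentage points)."""
    return [est - tru for est, tru in zip(estimated_pct, true_pct)]


def summarize(errors):
    """Bias is the mean error across basins; variance is the sample variance."""
    return {
        "bias": statistics.mean(errors),
        "variance": statistics.variance(errors),
        "min": min(errors),
        "max": max(errors),
    }


# Hypothetical relative changes (%) of a design flood in five basins.
true_change = [20.0, 15.0, 30.0, 25.0, 10.0]
model_change = [35.0, 5.0, 80.0, 20.0, 40.0]

summary = summarize(change_errors(model_change, true_change))
print(summary)
```

In this toy case the mean error is modest while individual basins are off by tens of percentage points, which is the distinction between bias and inter-basin variability drawn above.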
Deep learning post-processors improved historical streamflow prediction and modestly reduced bias in relative change estimates, particularly for models with more structural error (e.g., HYMODSimple,pp reduced bias by 5–10% at higher precipitation error levels). However, they did not reduce the error variance of the underlying process-based model.
The standalone LSTM model performed comparably to process models with low or no structural uncertainty, yielding relative change estimates with low bias (approximately ±10% error under perfect input) and low variance, and outperforming the simpler process-based models.
Historical precipitation uncertainty degraded historical model performance but had limited impact on the bias of relative change estimates for design floods. Structural limitations and parametric uncertainty were found to dominate errors in relative change predictions, overshadowing the effects of input data quality.
Historical model performance (e.g., high Nash-Sutcliffe Efficiency) was a poor predictor of model accuracy for design flood change estimates under climate change.
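
For reference, the Nash-Sutcliffe Efficiency mentioned above scores simulated flows against observed flows over a historical period. A minimal sketch of the standard formula follows; the function name and the flow values are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


# Hypothetical daily flows (m^3/s).
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [2.5, 3.5, 6.5, 7.5]
print(nash_sutcliffe(observed, simulated))
```

A value of 1 indicates a perfect match, and 0 means the simulation is no better than always predicting the observed mean. The score is computed on historical flows only, so it does not directly measure how well a model reproduces the response of flood quantiles to a changed climate.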
Regional pooling, using a simple median-based approach, significantly reduced the variance in design flood change estimates across basins for all models, with minimal changes to bias. For instance, for HYMODSimple,pp under 4–6 millimeters per day of precipitation error, the error range narrowed from approximately -50% to +150% down to approximately +5% to +25%.
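
A minimal sketch of median-based pooling follows. The summary does not spell out the pooling procedure, so this assumes the simplest reading: each basin's own relative-change estimate is replaced by the median of the estimates across all basins in the region. The function names and all values are hypothetical.

```python
import statistics


def pool_by_median(basin_estimates_pct):
    """Replace every at-site estimate with the regional median (assumed scheme)."""
    regional = statistics.median(basin_estimates_pct)
    return [regional] * len(basin_estimates_pct)


def error_range(estimates_pct, true_pct):
    """Smallest and largest per-basin error (estimated minus true)."""
    errors = [e - t for e, t in zip(estimates_pct, true_pct)]
    return min(errors), max(errors)


# Hypothetical relative changes (%) of a design flood in five basins.
true_change = [20.0, 15.0, 30.0, 25.0, 10.0]
at_site = [30.0, 5.0, 80.0, 20.0, 40.0]
pooled = pool_by_median(at_site)

print("at-site error range:", error_range(at_site, true_change))
print("pooled error range:", error_range(pooled, true_change))
```

In this toy case the pooled error range is narrower because outlying at-site estimates no longer enter any basin's value. How much the range narrows, and whether bias shifts, depends on how similar the true changes are across the pooled basins.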
Sensitivity analyses, including different model calibration schemes, alternative methods of flood quantile estimation, and an alternative truth model (GR4J), confirmed the primary findings.

Contributions

Quantifies both the bias and variance of relative change estimates in design floods, providing new insights into the probabilistic nature of these errors, which previous studies often represented as a range of outcomes.
Systematically evaluates the effectiveness of deep learning post-processors in reducing bias and variance in relative change estimates under climate change, highlighting their limited impact on variance despite improving historical fit.
Demonstrates the competitive performance of pure data-driven LSTM models in predicting relative changes in design floods with low bias and variance, even under varying precipitation uncertainty, in a stylized experimental setting.
Reveals that historical precipitation uncertainty has a limited impact on relative change estimates compared to structural and parametric uncertainties, and that historical model performance is an unreliable indicator of a model's accuracy in climate change impact assessment.
Introduces and validates regional pooling as an effective strategy to significantly reduce variance in design flood change estimates across basins, extending previous insights on regional flood magnitude estimation to the context of climate change impacts.
Provides practical guidance on model selection and regional aggregation to reduce uncertainty in hydrological projections, supporting more robust decision-making in water resource planning.

Funding

Massachusetts Executive Office of Energy and Environmental Affairs.

Citation

@article{Poudel2025Uncertainty,
  author = {Poudel, Sandeep and Najibi, Nasser and Steinschneider, Scott},
  title = {Uncertainty in estimating the relative change of design floods under climate change: a stylized experiment with process-based, deep learning, and hybrid models},
  journal = {Journal of Hydrology},
  year = {2025},
  doi = {10.1016/j.jhydrol.2025.134427},
  url = {https://doi.org/10.1016/j.jhydrol.2025.134427}
}

Original Source: https://doi.org/10.1016/j.jhydrol.2025.134427