Arabzadeh et al. (2026) A surrogate-aided approach for accelerated Bayesian calibration of hydrologic models
Identification
- Journal: Environmental Modelling & Software
- Year: 2026
- Date: 2026-01-24
- Authors: Rezgar Arabzadeh, Jonathan Romero-Cuellar, James Craig, Bryan A. Tolson, Robert Chlumsky
- DOI: 10.1016/j.envsoft.2026.106894
Research Groups
- University of Waterloo, Department of Civil and Environmental Engineering, Canada
Short Summary
This study introduces a surrogate-aided heteroscedastic autocorrelated error model (SHA) that uses Support Vector Regression (SVR) to decouple the inference of hydrologic and error model parameters in Bayesian calibration. Across 12 MOPEX watersheds modeled with GR4J, the approach converges with approximately 50% fewer MCMC samples than joint inference while improving or maintaining predictive accuracy.
Objective
- To develop and demonstrate a surrogate-aided approach for Bayesian calibration of hydrologic models that decouples the inference of hydrologic and error model parameters, thereby accelerating convergence and improving predictive accuracy.
Study Configuration
- Spatial Scale: 12 MOPEX watersheds in the southeastern continental United States, ranging in area from 1023 km² to 4419 km². Each watershed was modeled as a single Hydrologic Response Unit (HRU), i.e., with a lumped model.
- Temporal Scale: 56 years of daily hydrometeorological and streamflow data (1948–2003) for calibration and evaluation.
Methodology and Data
- Models used:
  - Hydrologic model: GR4J (with the CemaNeige snow model and lapse-rate adjustments), implemented in the Raven framework.
  - Surrogate model: Support Vector Regression (SVR) for estimating error model parameters.
  - Inference algorithm: No-U-Turn Sampler (NUTS) from the LaplacesDemon R package for Bayesian inference.
  - Error models benchmarked: Surrogate-aided Heteroscedastic Autocorrelated error model (SHA); Heteroscedastic Autocorrelated error model with Student's t-distribution (THA); homoscedastic, uncorrelated error model with Gaussian distribution (G); Gaussian Heteroscedastic Autocorrelated error model (GHA).
- Data sources:
  - Hydrometeorological and streamflow data for the MOPEX12 watersheds, publicly available from NOAA (ftp://hydrology.nws.noaa.gov/pub/gcip/mopex/USData/).
  - Data and model setups archived at https://github.com/rarabzad/GR4JMOPEX/tree/main.
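The decoupling described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and none of it is taken from the paper: `toy_hydrologic_model` is a one-parameter linear reservoir standing in for GR4J, a k-nearest-neighbour average stands in for the SVR surrogate, the error model is simplified to a homoscedastic Gaussian AR(1) process, and the surrogate is trained on moment estimates of the residual scale and lag-1 autocorrelation. The point is only the structure: a sampler would propose hydrologic parameters alone, and the error parameters are filled in deterministically by the surrogate inside the likelihood.

```python
import math
import random


def toy_hydrologic_model(theta, rain):
    """Hypothetical stand-in for GR4J: a linear reservoir with outflow rate k."""
    (k,) = theta
    storage, flows = 0.0, []
    for p in rain:
        storage += p
        q = k * storage
        storage -= q
        flows.append(q)
    return flows


def fit_error_params(sim, obs):
    """Assumed training target: residual scale and lag-1 autocorrelation (zero-mean moments)."""
    res = [o - s for o, s in zip(obs, sim)]
    var = sum(r * r for r in res) / len(res)
    cov = sum(res[t] * res[t - 1] for t in range(1, len(res))) / len(res)
    sigma = math.sqrt(var) + 1e-9
    phi = max(-0.99, min(0.99, cov / (var + 1e-12)))
    return sigma, phi


def make_knn_surrogate(train_thetas, train_targets, k=3):
    """k-nearest-neighbour average, used here in place of SVR to stay dependency-free."""
    def predict(theta):
        nearest = sorted(
            range(len(train_thetas)),
            key=lambda i: math.dist(train_thetas[i], theta),
        )[:k]
        n_out = len(train_targets[0])
        return tuple(
            sum(train_targets[i][j] for i in nearest) / len(nearest)
            for j in range(n_out)
        )
    return predict


def log_likelihood(theta, rain, obs, surrogate):
    """Gaussian AR(1) log-likelihood; only theta is a free (sampled) parameter."""
    sim = toy_hydrologic_model(theta, rain)
    sigma, phi = surrogate(theta)  # error parameters come from the surrogate
    res = [o - s for o, s in zip(obs, sim)]
    innov_sd = sigma * math.sqrt(1.0 - phi ** 2)
    ll = -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * (res[0] / sigma) ** 2
    for t in range(1, len(res)):
        e = res[t] - phi * res[t - 1]
        ll += -0.5 * math.log(2 * math.pi * innov_sd ** 2) - 0.5 * (e / innov_sd) ** 2
    return ll


# Synthetic data: the "observations" are the toy model at k = 0.3 plus noise.
random.seed(0)
rain = [random.expovariate(1.0) for _ in range(200)]
obs = [q + random.gauss(0.0, 0.05) for q in toy_hydrologic_model((0.3,), rain)]

# Train the surrogate on a small design of hydrologic parameter samples.
train_thetas = [(random.uniform(0.05, 0.95),) for _ in range(50)]
train_targets = [
    fit_error_params(toy_hydrologic_model(th, rain), obs) for th in train_thetas
]
surrogate = make_knn_surrogate(train_thetas, train_targets)

ll = log_likelihood((0.3,), rain, obs, surrogate)
print(f"log-likelihood at theta=(0.3,): {ll:.2f}")
```

Because `log_likelihood` takes only `theta`, an MCMC sampler wrapped around it explores a lower-dimensional space than one that samples the error parameters jointly, which is the mechanism the summary credits for faster convergence.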
Main Results
- The surrogate-aided approach (SHA) converged markedly faster in Bayesian calibration: for the Guadalupe watershed it reached the Gelman-Rubin (GR) convergence threshold of 1.1 at approximately 40,000 samples, whereas THA and GHA remained above the threshold at that point.
- SHA required approximately 49.2% fewer MCMC samples to converge compared to THA (38,350 ± 3795 samples vs. 75,430 ± 7230 samples) and 52.1% fewer than GHA (38,350 ± 3795 samples vs. 80,130 ± 6358 samples).
- This reduction in sample size translated into substantial computational time savings: SHA completed calibration in 6.5 ± 0.6 hours, 48.4% less time than THA (12.6 ± 1.2 hours) and 51.5% less time than GHA (13.4 ± 1.1 hours).
- SHA consistently demonstrated superior predictive accuracy, exhibiting the lowest median Continuous Ranked Probability Score (CRPS) and a narrower interquartile range across all 12 MOPEX watersheds.
- The method achieved a favorable balance between predictive reliability (coverage of observed flows) and sharpness (narrowness of credible intervals).
- The SVR model effectively mapped hydrologic parameters to error model parameters, achieving a mean R² of 0.89 ± 0.06 for error model parameter prediction across all MOPEX12 watersheds with 1000 training samples.
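The convergence diagnostic and the reported savings can be checked in a few lines. The sketch below implements the classic Gelman-Rubin potential scale reduction factor for a single scalar parameter; whether the paper uses this exact variant is an assumption. `relative_reduction` is a hypothetical helper that re-derives the percentages from the mean values quoted above, and the chains are synthetic.

```python
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for m equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(chain_means)                       # B
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


def relative_reduction(new, old):
    """Fractional saving of `new` relative to `old`."""
    return 1.0 - new / old


random.seed(1)
# Four chains drawn from the same distribution: should look converged.
mixed = [[random.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Four chains stuck around different means: should look unconverged.
stuck = [[random.gauss(float(mu), 1.0) for _ in range(2000)] for mu in range(4)]

print("R-hat, mixed chains:", round(gelman_rubin(mixed), 3))
print("R-hat, stuck chains:", round(gelman_rubin(stuck), 3))

# Mean values reported in the summary (samples, then hours).
print("samples vs THA:", round(100 * relative_reduction(38350, 75430), 1), "%")
print("samples vs GHA:", round(100 * relative_reduction(38350, 80130), 1), "%")
print("runtime vs THA:", round(100 * relative_reduction(6.5, 12.6), 1), "%")
print("runtime vs GHA:", round(100 * relative_reduction(6.5, 13.4), 1), "%")
```

A chain set is conventionally treated as converged once the statistic falls below the 1.1 threshold cited in the results.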
Contributions
- Introduces a novel surrogate-aided heteroscedastic autocorrelated error model (SHA) that decouples the inference of hydrologic and error model parameters in Bayesian calibration, addressing the slow convergence caused by high-dimensional parameter interactions.
- Demonstrates the effective use of Support Vector Regression (SVR) as a surrogate to deterministically estimate error model parameters conditional on hydrologic parameters within an MCMC sampler.
- Achieves significant computational efficiency gains, reducing both the required number of MCMC samples and the calibration runtime by approximately 50% compared to traditional joint inference methods.
- Consistently improves or maintains predictive accuracy, as evidenced by lower CRPS values and a better balance of reliability and sharpness in probabilistic streamflow predictions.
- Provides a promising framework for efficient operational hydrologic applications, large-scale modeling, and real-time streamflow forecasting, particularly when computational resources are limited.
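For readers unfamiliar with the CRPS used to score predictive accuracy, a minimal sketch of the standard ensemble estimator follows. It is a generic formulation, not the paper's evaluation code, and the ensembles and observation are made-up numbers. Lower values are better, and the score rewards ensembles that are both centred on the observation and sharp.

```python
def crps_ensemble(ensemble, obs):
    """Ensemble CRPS estimate: mean |x_i - y| minus half the mean |x_i - x_j|."""
    n = len(ensemble)
    accuracy = sum(abs(x - obs) for x in ensemble) / n
    spread = sum(abs(a - b) for a in ensemble for b in ensemble) / (2 * n * n)
    return accuracy - spread


observed_flow = 1.0              # hypothetical observation
sharp = [0.9, 1.0, 1.1]          # narrow ensemble centred on the observation
diffuse = [0.0, 1.0, 2.0]        # wide ensemble centred on the observation

print("CRPS, sharp ensemble:  ", crps_ensemble(sharp, observed_flow))
print("CRPS, diffuse ensemble:", crps_ensemble(diffuse, observed_flow))
```

With a single-member ensemble the estimator reduces to the absolute error, which makes the CRPS directly comparable to deterministic error metrics.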
Funding
- NSERC Discovery grants: RGPIN-2017-03920, RGPIN-2023-03459 (J. Craig); RGPIN-2022-03890 (B. Tolson)
- NSERC Tier II Canada Research Chair in Hydrologic Modeling and Analysis: CRC-2020-00176 (J. Craig)
- NSERC Collaborative Research and Development Grant: 536471-18 (supported by Ontario Power Generation)
Citation
@article{Arabzadeh2026surrogateaided,
author = {Arabzadeh, Rezgar and Romero-Cuellar, Jonathan and Craig, James and Tolson, Bryan A. and Chlumsky, Robert},
title = {A surrogate-aided approach for accelerated Bayesian calibration of hydrologic models},
journal = {Environmental Modelling \& Software},
year = {2026},
doi = {10.1016/j.envsoft.2026.106894},
url = {https://doi.org/10.1016/j.envsoft.2026.106894}
}
Original Source: https://doi.org/10.1016/j.envsoft.2026.106894