Lee et al. (2025) A comparative assessment of a hybrid approach against conventional and machine-learning daily streamflow prediction in ungauged basins
Identification
- Journal: Journal of Hydrology: Regional Studies
- Year: 2025
- Date: 2025-10-15
- Authors: Seung Cheol Lee, Daeha Kim
- DOI: 10.1016/j.ejrh.2025.102854
Research Groups
- Department of Civil Engineering, Jeonbuk National University, Jeonju-si, Jeonbuk State 54896, Republic of Korea
Short Summary
This study compared a hybrid model (differentiable Parameter Learning with HBV) against traditional HBV and a standalone LSTM for daily streamflow prediction in 671 ungauged basins across the contiguous United States. The LSTM achieved the highest predictive accuracy, but the hybrid model offered valuable diagnostic insights into model failure modes, revealing systematic low-flow truncation caused by specific parameter biases.
Objective
- To compare the dPL-based hybrid framework with established models (traditional regionalized HBV and standalone LSTM) in terms of predictive accuracy, interpretability, and failure modes.
- To identify novel diagnostic capabilities or practical insights provided by the hybrid model that are not achieved with purely conceptual or black-box approaches.
Study Configuration
- Spatial Scale: 671 minimally disturbed basins across the contiguous United States (CONUS), spanning drainage areas from approximately 4 square kilometers to 25,791 square kilometers. Basins were partitioned into seven hydrologically coherent zones.
- Temporal Scale: Daily records from 1 October 1980 to 30 September 2010 (30 years). The first 15 years (1 October 1980 to 30 September 1995) were used for model training/calibration, and the subsequent 15 years (1 October 1995 to 30 September 2010) for validation.
Methodology and Data
- Models used:
  - HBV-light version 2.0: A conceptual rainfall-runoff model with 12 parameters. Calibrated using the Shuffled Complex Evolution (SCE-UA) algorithm minimizing a composite RMSE* objective function. Regionalized using proximity-based parameter averaging from the five nearest donor basins.
  - Long Short-Term Memory (LSTM) network: A machine learning model for daily streamflow simulation, trained using the hydroDL library on PyTorch.
  - Differentiable Parameter Learning (dPL): An LSTM-based parameter estimator (gA) that learns to estimate HBV parameters.
  - Hybrid framework (dPL + HBV): Couples the dPL parameter estimator with the HBV-light simulator, trained end-to-end via backpropagation.
- Data sources:
  - Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset: Comprising 671 basins, including daily discharges from the United States Geological Survey (USGS) gauge network and 35 basin-averaged descriptors (e.g., topography, climate, hydrology, land cover, soil, geology).
  - Daymet daily weather product (1 km resolution): For precipitation, maximum/minimum temperature, and incoming radiation.
  - Potential evapotranspiration (Ep): Computed by the temperature-based Hargreaves method, provided within CAMELS.
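The proximity-based regionalization described for HBV-light can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `regionalize_by_proximity`, the use of projected centroid coordinates, the plain Euclidean distance, and the unweighted arithmetic mean are all assumptions, and the donor coordinates and parameter values are made up.

```python
import numpy as np

def regionalize_by_proximity(target_xy, donor_xy, donor_params, n_donors=5):
    """Average calibrated parameters of the n_donors nearest donor basins.

    Assumes coordinates are projected (e.g. in km), so Euclidean distance
    is a reasonable proximity measure. Hypothetical helper, for illustration.
    """
    target_xy = np.asarray(target_xy, dtype=float)
    donor_xy = np.asarray(donor_xy, dtype=float)
    donor_params = np.asarray(donor_params, dtype=float)

    # Distance from the ungauged target basin to every donor basin
    dist = np.linalg.norm(donor_xy - target_xy, axis=1)

    # Indices of the closest donors, nearest first
    nearest = np.argsort(dist)[:n_donors]

    # Unweighted mean of each parameter across the selected donors (assumption)
    return donor_params[nearest].mean(axis=0), nearest

# Made-up example: 7 donor basins, 2 parameters each
donor_xy = [[0, 0], [1, 0], [0, 1], [2, 2], [5, 5], [10, 10], [1, 1]]
donor_params = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50], [6, 60], [7, 70]]

params, nearest = regionalize_by_proximity([0.4, 0.3], donor_xy, donor_params)
print("donors used:", sorted(nearest.tolist()))
print("transferred parameters:", params)
```

Because the result depends entirely on which donors fall within the nearest five, this kind of transfer is sensitive to donor basin selection, which is the dependence the dPL approach is reported to avoid.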
Main Results
- Predictive Performance (Ungauged Basins):
  - Regionalized LSTM achieved the highest predictive accuracy (mean Kling-Gupta Efficiency (KGE) of 0.57 ± 0.41, Nash-Sutcliffe Efficiency (NSE) of 0.54 ± 0.52, and logarithmic NSE (LNSE) of 0.62 ± 0.30).
  - The hybrid model (dPL+HBV) yielded comparable but lower performance (mean KGE of 0.41 ± 0.73, NSE of 0.46 ± 0.55, and LNSE of 0.31 ± 0.35).
  - The traditional proximity-based HBV model showed similar performance to the hybrid model (mean KGE of 0.46 ± 0.60, NSE of 0.43 ± 0.55, and LNSE of 0.17 ± 0.73).
  - The hybrid model exhibited greater robustness in low-flow simulation, with a lower standard deviation in LNSE (0.35) compared to traditional HBV (0.73).
- Model Failure Modes:
  - LSTM consistently underestimated peak flows and misrepresented their timing.
  - Traditional HBV exaggerated the recession limbs of the hydrograph.
  - The hybrid model displayed a unique "low-flow truncation" phenomenon, failing to generate streamflow below a distinct threshold, particularly in arid regions.
- Diagnostic Insights from Hybrid Model: The low-flow truncation in the hybrid model was attributed to systematic biases in dPL's parameter estimation, specifically consistently small values for parK2 (baseflow recession coefficient), high values for parK0 (surface runoff rate), and low values for parUZL (upper reservoir threshold). These settings lead to rapid routing of precipitation as overland flow and suppressed baseflow generation.
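The three scores reported under Predictive Performance can be computed as in the sketch below. It uses the common textbook definitions; the KGE here is the original three-component form (correlation, variability ratio, bias ratio), and the small offset `eps` added before taking logarithms is an assumption, since the summary does not state which KGE variant or log offset the paper used. The flow values are invented purely to show how an error confined to low flows affects LNSE far more than NSE.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def lnse(obs, sim, eps=0.01):
    """NSE on log-transformed flows; eps (assumed value) avoids log(0)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return nse(np.log(obs + eps), np.log(sim + eps))

def kge(obs, sim):
    """Kling-Gupta Efficiency from correlation, variability ratio, and bias ratio."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Invented flows: the simulation matches high flows but never drops below 0.5
obs = np.array([5.0, 3.0, 1.0, 0.5, 0.1, 0.05])
sim_floor = np.maximum(obs, 0.5)

print("NSE :", nse(obs, sim_floor))
print("LNSE:", lnse(obs, sim_floor))
print("KGE :", kge(obs, sim_floor))
```

NSE is dominated by squared errors at high flows, so it barely registers the missing low-flow tail, while the log transform in LNSE weights those days heavily. This is why LNSE is the score that separates the models on low-flow behavior.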
Contributions
- Provided a comprehensive comparative assessment of a dPL-based hybrid modeling framework against conventional regionalization and pure machine learning approaches for daily streamflow prediction in ungauged basins.
- Demonstrated the diagnostic utility of the hybrid framework by linking systematic predictive errors (low-flow truncation) to specific, interpretable parameters of the conceptual HBV model, thereby bridging the gap between black-box and process-based modeling.
- Showed that dPL, as an advanced regression-based regionalization, achieved predictive performance comparable to proximity-based methods, addressing a historical challenge in regionalization.
- Highlighted the scalability of dPL for complex hydrological models and its independence from donor basin selection.
- Identified a critical limitation of dPL: its potential to introduce hydrologically unrealistic parameter biases (e.g., low parK2, high parK0) as it learns to compensate for structural deficiencies of the coupled conceptual model.
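To make the parameter bias named above concrete, the sketch below runs a simplified two-reservoir response routine in the style of HBV with two parameter sets. It is a toy under stated assumptions, not HBV-light itself or the paper's code: the function `hbv_response`, the order of operations, the recharge series, and every parameter value are hypothetical, and snow, soil moisture, and channel routing are omitted. The "biased" set only mimics the reported direction of the bias (low K2, high K0, low UZL).

```python
def hbv_response(recharge, k0, k1, k2, uzl, perc):
    """Toy HBV-style response routine (simplified, hypothetical).

    Returns total outflow from each pathway: q0 (fast runoff above the
    threshold uzl), q1 (interflow), and q2 (baseflow from the lower zone).
    """
    suz, slz = 0.0, 0.0               # upper and lower zone storages (mm)
    totals = {"q0": 0.0, "q1": 0.0, "q2": 0.0}
    for r in recharge:
        suz += r
        p = min(perc, suz)            # percolation to the lower zone
        suz -= p
        slz += p
        q0 = k0 * max(suz - uzl, 0.0) # fast runoff, only above the threshold
        suz -= q0
        q1 = k1 * suz                 # interflow from the upper zone
        suz -= q1
        q2 = k2 * slz                 # baseflow from the lower zone
        slz -= q2
        totals["q0"] += q0
        totals["q1"] += q1
        totals["q2"] += q2
    return totals

def share(totals, key):
    """Fraction of total outflow delivered through one pathway."""
    return totals[key] / sum(totals.values())

# Two recharge pulses over 30 days (made-up values, mm/day)
recharge = [30.0] + [0.0] * 14 + [10.0] + [0.0] * 14

# Hypothetical parameter sets, chosen only to contrast the two behaviors
plausible = hbv_response(recharge, k0=0.2, k1=0.1, k2=0.05, uzl=20.0, perc=2.0)
biased = hbv_response(recharge, k0=0.9, k1=0.1, k2=0.001, uzl=1.0, perc=2.0)

for name, totals in [("plausible", plausible), ("biased", biased)]:
    print(name, "fast share:", round(share(totals, "q0"), 3),
          "baseflow total:", round(totals["q2"], 3))
```

In this toy setup the biased set sends most of the water out through the fast pathway and leaves the lower zone releasing very little, which matches the source's description of rapid routing and suppressed baseflow. How that translates into the truncated hydrographs in the paper depends on the full model and is not reproduced here.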
Funding
- National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00416443).
Citation
@article{Lee2025comparative,
author = {Lee, Seung Cheol and Kim, Daeha},
title = {A comparative assessment of a hybrid approach against conventional and machine-learning daily streamflow prediction in ungauged basins},
journal = {Journal of Hydrology: Regional Studies},
year = {2025},
doi = {10.1016/j.ejrh.2025.102854},
url = {https://doi.org/10.1016/j.ejrh.2025.102854}
}
Original Source: https://doi.org/10.1016/j.ejrh.2025.102854