Staudinger et al. (2025) How well do process-based and data-driven hydrological models learn from limited discharge data?

Identification

Journal: Hydrology and Earth System Sciences
Year: 2025
Date: 2025-10-08
Authors: Maria Staudinger, Anna Herzog, Ralf Loritz, Tobias Houska, Sandra Pool, Diana Spieler, Paul D. Wagner, Juliane Mai, Jens Kiesel, Stephan Thober, Björn Guse, Uwe Ehret
DOI: 10.5194/hess-29-5005-2025

Research Groups

Department of Geography, University of Zurich, Switzerland
Department of Hydrology and Climatology, Institute of Environmental Science and Geography, University of Potsdam, Germany
Institute of Water and Environment, Karlsruhe Institute of Technology (KIT), Germany
Department of Landscape Ecology and Resources Management, University of Gießen, Germany
Department Water Resources and Drinking Water, Eawag – Swiss Federal Institute of Aquatic Science and Technology, Switzerland
Department of Hydrosciences, Institute of Hydrology and Meteorology, TUD Dresden University of Technology, Germany
Department of Hydrology and Water Resources Management, Institute for Natural Resource Conservation, Kiel University, Germany
Earth and Environmental Science, University of Waterloo, Canada
Computational Hydrosystems, Helmholtz Centre for Environmental Research – UFZ, Germany
German Research Centre for Geosciences, Section Hydrology, Potsdam, Germany
Stone Environmental, USA
Schulich School of Engineering, University of Calgary, Canada

Short Summary

This study systematically compares the learning behavior of process-based and data-driven hydrological models under varying discharge data availability, selection strategies, and spatial input resolutions. It finds that while process-based models initially outperform data-driven ones with limited data, Long Short-Term Memory (LSTM) networks achieve superior and continuously improving performance with sufficient training data, demonstrating the critical role of data quantity, memory, and spatial input in model learning.

Objective

To investigate how well process-based and data-driven hydrological models learn from limited discharge data.
To determine if there is a dataset size beyond which data-driven models outperform process-based models.
To examine how different training data selection schemes (random, according to information content, contiguous, independent) affect model performance.
To assess whether analyzing the information content of catchment data can predict the achievable performance of different model types.
To investigate if using more spatially distributed model inputs improves model performance compared to spatially lumped inputs.

Study Configuration

Spatial Scale: Three meso-scale catchments in Germany (Iller: 2140 km², Saale: 1011 km², Selke: 461 km²), representing Alpine, low mountain range, and transition landscapes. Input data resolutions include 100 m (DEM, land cover, soil map) and 1 km (gridded precipitation, temperature, potential evapotranspiration). Models used lumped, semi-distributed (sub-basins, HRUs), and distributed spatial discretizations.
Temporal Scale: Dynamic input data (precipitation, air temperature, potential evapotranspiration, discharge) available daily for 2000–2015. Training period: 1 January 2001 to 31 December 2010. Validation period: 1 January 2012 to 31 December 2015. Warm-up periods used for both training and validation. Training sample sizes ranged from 2 to 3654 daily data points.

Methodology and Data

Models used:
- Process-based: GR4J (Génie Rural à 4 paramètres Journalier), HBV (Hydrologiska Byråns Vattenbalansavdelning), SWAT+ (Soil and Water Assessment Tool Plus).
- Data-driven: EDDIS (Empirical Discrete Distributions), RTREE (Regression Tree), ANN (Artificial Neural Network), LSTM (Long Short-Term Memory network).
Data sources:
- Static: DEM100 (100 m resolution, Yamazaki et al., 2019), CORINE land cover (100 m, CLMS, 2019), Soil map (BÜK200, 100 m resampled, BGR, 1999).
- Dynamic (daily): Gridded precipitation and air temperature (1 km, interpolated from DWD station data), potential evapotranspiration (1 km, estimated using the Hargreaves and Samani equation), discharge (gauge observations from local authorities).
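For reference, the Hargreaves and Samani equation mentioned above is commonly written in the form below. This is the standard textbook form, given here as an assumption; the summary does not state which variant or which temperature inputs the study used. ET_0 is the potential (reference) evapotranspiration in mm/day, R_a is the extraterrestrial radiation expressed as equivalent evaporation in mm/day, and the temperatures are daily mean, maximum, and minimum air temperature in °C.

```latex
\[
  ET_0 = 0.0023 \, R_a \, \left(T_{\mathrm{mean}} + 17.8\right) \sqrt{T_{\max} - T_{\min}}
\]
```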
Performance Measures: Conditional entropy (Hc) and Joint entropy (Hj) for system analysis and model performance evaluation. Kling–Gupta efficiency (KGE) also provided for comparison.
Training Objective Functions: KGE for process-based models, Root Mean Square Error (RMSE) for RTREE, Mean Squared Error (MSE) for ANN and LSTM. EDDIS required no training.
Sampling Schemes: Fully random, consecutive random, and optimal (Douglas–Peucker algorithm) sampling of training data.
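As a rough illustration of the measures and sampling schemes listed above, the following Python sketch computes KGE and draws training-day indices under the fully random and consecutive random schemes. It is not the study's code: the function names are hypothetical, the KGE variant is assumed to be the original 2009 formulation (the summary does not say which one was used), and the Douglas–Peucker "optimal" scheme is left out.

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta efficiency, assuming the original 2009 formulation (1 = perfect)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def sample_fully_random(n_total, n_sample, rng):
    """Pick n_sample distinct day indices anywhere in the training period."""
    return np.sort(rng.choice(n_total, size=n_sample, replace=False))

def sample_consecutive_random(n_total, n_sample, rng):
    """Pick one contiguous block of n_sample days with a random start day."""
    start = rng.integers(0, n_total - n_sample + 1)  # upper bound is exclusive
    return np.arange(start, start + n_sample)

rng = np.random.default_rng(0)
n_total = 3654  # largest training sample size reported above
scattered_days = sample_fully_random(n_total, 500, rng)
block_days = sample_consecutive_random(n_total, 500, rng)
```

Both samplers return indices into the daily training series, so the same calibration routine can be run on either selection and the resulting scores compared at equal sample size.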

Main Results

Model Learning Behavior: Process-based models (GR4J, HBV, SWAT+) initially outperformed data-driven models with small training datasets due to their predefined hydrological structure. However, their learning curves quickly saturated (around 500 data points). Data-driven models, particularly the LSTM network, continued to learn with increasing data, outperforming all process-based models when trained with more than 2–5 years of data and showing no saturation.
Impact of Sampling Strategy: For the HBV model, fully random sampling of training data points generally led to better learning results than consecutive random sampling or optimal sampling using the Douglas–Peucker algorithm, especially for the Selke catchment.
Role of Memory: Introducing memory (e.g., the previous day's or week's data) significantly reduced the conditional entropy of discharge, indicating improved model performance. This effect was particularly pronounced in the Iller (Alpine) and Saale (low mountain range) catchments.
Information Content and Learnability: The catchment with the highest joint entropy (Iller, highest data variability) also exhibited the highest learnability (lowest conditional entropy of discharge given inputs), while the catchment with the lowest joint entropy (Selke) showed the lowest learnability.
Spatial Discretization: Using semi-distributed input data for the HBV model generally improved model performance compared to lumped inputs, especially for the Iller and Selke catchments, which have more heterogeneous topography and precipitation patterns. The improvement was less pronounced for the Saale catchment.
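The entropy measures behind the memory and learnability results above can be sketched with a simple histogram estimator. This is a minimal sketch under stated assumptions: equal-width binning with 8 bins per variable, hypothetical function names, and a synthetic toy reservoir in place of catchment data. The summary does not specify the binning used in the study.

```python
import numpy as np

def _entropy(counts):
    """Shannon entropy in bits of a histogram of counts."""
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def joint_entropy(columns, bins=8):
    """H(X1, ..., Xk) from equal-width binning of each variable."""
    data = np.column_stack(columns)
    counts, _ = np.histogramdd(data, bins=bins)
    return _entropy(counts.ravel())

def conditional_entropy(target, predictors, bins=8):
    """H(target | predictors) = H(target, predictors) - H(predictors)."""
    return joint_entropy([target, *predictors], bins) - joint_entropy(predictors, bins)

# Synthetic demo only: a toy linear reservoir, not data from the study.
rng = np.random.default_rng(42)
precip = rng.gamma(shape=0.5, scale=4.0, size=3000)
q = np.zeros_like(precip)
for t in range(1, len(q)):
    q[t] = 0.9 * q[t - 1] + 0.1 * precip[t]

h_no_memory = conditional_entropy(q[1:], [precip[1:]])
h_memory = conditional_entropy(q[1:], [precip[1:], q[:-1]])  # adds previous-day discharge
print(h_no_memory, h_memory)
```

Adding a predictor can never increase this plug-in estimate of conditional entropy, so part of any reduction is an artifact of sparse bins. Comparisons are only meaningful when the sample is large relative to the number of occupied bins.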

Contributions

Provides a systematic and comprehensive comparison of the learning capabilities of a diverse set of process-based and data-driven hydrological models under varying data availability and selection strategies.
Quantifies the "data threshold" at which advanced data-driven models (LSTMs) surpass traditional process-based models in performance, highlighting the continuous learning capacity of LSTMs.
Evaluates the effectiveness of different training data sampling schemes (fully random, consecutive random, optimal) for hydrological model calibration, offering practical insights for data collection and model training.
Demonstrates the utility of information theory measures (joint and conditional entropy) for characterizing catchment data variability, assessing model learnability, and evaluating model performance in a dimensionless and comparable manner.
Investigates the impact of spatial input data discretization on model learning, showing that more spatially explicit inputs can enhance performance, particularly in heterogeneous catchments.

Funding

Deutsche Forschungsgemeinschaft (grant no. 471280762)

Citation

@article{Staudinger2025How,
  author = {Staudinger, Maria and Herzog, Anna and Loritz, Ralf and Houska, Tobias and Pool, Sandra and Spieler, Diana and Wagner, Paul D. and Mai, Juliane and Kiesel, Jens and Thober, Stephan and Guse, Björn and Ehret, Uwe},
  title = {How well do process-based and data-driven hydrological models learn from limited discharge data?},
  journal = {Hydrology and Earth System Sciences},
  year = {2025},
  doi = {10.5194/hess-29-5005-2025},
  url = {https://doi.org/10.5194/hess-29-5005-2025}
}

Original Source: https://doi.org/10.5194/hess-29-5005-2025