Wu et al. (2026) Modeling runoff with incomplete data: a comparison of hydrological, deep learning, and hybrid approaches

Identification

Journal: Journal of Hydrology
Year: 2026
Date: 2026-02-14
Authors: Jiarui Wu, Conrad Zorn, Weiru Zhao, Björn Klöve, Wen Liu, Wenzhou Guo, Beibei Wang, Shengchao Qiao, Chaoqing Yu, Xiao Huang, Chao Wang
DOI: 10.1016/j.jhydrol.2026.135132

Research Groups

Center for Eco-Environment Restoration Engineering of Hainan Province, School of Ecology, Hainan University, Haikou 570228, China
Key Laboratory of Integrated Regulation and Resource Development on Shallow Lakes, Ministry of Education, College of Environment, Hohai University, Nanjing 210098, China
Department of Civil and Environmental Engineering, University of Auckland, Auckland 1010, New Zealand
College of Hydrology and Water Resources, Hohai University, Nanjing 210098, China
Water Resources and Environmental Engineering Research Unit, Faculty of Technology, University of Oulu, 90014, Finland
The National Key Laboratory of Water Disaster Prevention, College of Hydrology and Water Resources, Hohai University, Nanjing, Jiangsu 210098, China

Short Summary

This study systematically evaluates hydrological, deep learning, and hybrid runoff models under various data scarcity scenarios across forty catchments. It finds process-based models more reliable in data-scarce conditions, while hybrid models effectively combine physical knowledge with data-driven flexibility, underscoring the importance of model selection based on data availability.

Objective

To systematically evaluate the performance and sensitivity of physical process-based, data-driven, and hybrid runoff models under varying degrees and types of input data scarcity.

Study Configuration

Spatial Scale: Forty catchments in different regions.
Temporal Scale: Not specified in the provided text. Data availability was varied through three typical missing-data scenarios and ten levels of data sparsity.

Methodology and Data

Models used: Xinanjiang (XAJ) model (physical process-based), Long Short-Term Memory (LSTM) model (data-driven), META (hybrid), Guided-LSTM (hybrid).
Data sources: Existing hydrological data from forty catchments, with artificially introduced data gaps to simulate scarcity.
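
The gap-introduction step can be illustrated with a minimal sketch. The summary does not describe the paper's three missing-data scenarios or its ten sparsity levels, so everything below is an assumption made for illustration: the two masking patterns (random and contiguous), the function names `mask_random` and `mask_block`, the synthetic runoff series, and the ten missing fractions from 0% to 90%.

```python
import numpy as np


def mask_random(series, missing_fraction, rng):
    """Return a copy with a random subset of time steps set to NaN."""
    out = np.asarray(series, dtype=float).copy()
    n_missing = int(round(missing_fraction * out.size))
    idx = rng.choice(out.size, size=n_missing, replace=False)
    out[idx] = np.nan
    return out


def mask_block(series, missing_fraction, rng):
    """Return a copy with one contiguous block of time steps set to NaN."""
    out = np.asarray(series, dtype=float).copy()
    n_missing = int(round(missing_fraction * out.size))
    if n_missing == 0:
        return out
    # Pick a start index so the whole block fits inside the series.
    start = rng.integers(0, out.size - n_missing + 1)
    out[start:start + n_missing] = np.nan
    return out


rng = np.random.default_rng(0)

# Synthetic daily runoff for one year (hypothetical data, not from the paper).
runoff = rng.gamma(shape=2.0, scale=1.5, size=365)

# Ten hypothetical sparsity levels: 0%, 10%, ..., 90% of records removed.
levels = np.linspace(0.0, 0.9, 10)

# One degraded copy of the series per (pattern, level) pair.
degraded = {
    (name, float(level)): fn(runoff, level, rng)
    for name, fn in (("random", mask_random), ("block", mask_block))
    for level in levels
}
```

Each degraded series could then be passed to a model's calibration or training routine, with the untouched series held back for evaluation.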

Main Results

The deep learning model (LSTM) was highly sensitive to data availability and to how well the dataset covered flow regimes and extremes.
The process-based Xinanjiang (XAJ) model maintained reliability under data-scarce conditions, achieving mean Nash-Sutcliffe Efficiency (NSE) and Kling-Gupta Efficiency (KGE) values both exceeding 0.7.
Hybrid models (META and Guided-LSTM) effectively combined physical process knowledge with data-driven flexibility, showing benefits under different data conditions.
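
For reference, the two scores cited above can be computed as in the sketch below. It uses the standard NSE definition and the original 2009 form of KGE (correlation, variability ratio, bias ratio). Whether the paper uses this KGE variant or a later modification is not stated in the summary, so treat that choice, the function names, and the NaN handling as assumptions.

```python
import numpy as np


def _paired(obs, sim):
    """Keep only time steps where both observed and simulated values exist."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    valid = ~np.isnan(obs) & ~np.isnan(sim)
    return obs[valid], sim[valid]


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = _paired(obs, sim)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def kge(obs, sim):
    """Kling-Gupta Efficiency (2009 form, assumed here): 1 is a perfect fit."""
    obs, sim = _paired(obs, sim)
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

Both functions skip time steps with missing values, which matters when observed runoff itself contains gaps.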

Contributions

Provides a systematic evaluation of three distinct categories of runoff models (process-based, data-driven, and hybrid) under various data scarcity conditions.
Improves understanding of model performance and sensitivity in data-scarce hydrological settings, an area that has received limited prior study.
Emphasizes the critical role of data availability in guiding appropriate model selection for practical hydrological applications.

Funding

Not specified in the provided text.

Citation

@article{Wu2026Modeling,
  author = {Wu, Jiarui and Zorn, Conrad and Zhao, Weiru and Klöve, Björn and Liu, Wen and Guo, Wenzhou and Wang, Beibei and Qiao, Shengchao and Yu, Chaoqing and Huang, Xiao and Wang, Chao},
  title = {Modeling runoff with incomplete data: a comparison of hydrological, deep learning, and hybrid approaches},
  journal = {Journal of Hydrology},
  year = {2026},
  doi = {10.1016/j.jhydrol.2026.135132},
  url = {https://doi.org/10.1016/j.jhydrol.2026.135132}
}

Original Source: https://doi.org/10.1016/j.jhydrol.2026.135132