Kareem et al. (2025) Runoff prediction under climatic variability using SWAT and machine learning models: a case study of the Hunza River basin

Identification

Journal: Theoretical and Applied Climatology
Year: 2025
Date: 2025-12-01
Authors: Muhammad Ghawas Kareem, Tang De-shan, Muhammad Farhan, Anis ur Rehman Khalil, Hafiz Ahmad Hammad Abid
DOI: 10.1007/s00704-025-05944-8

Research Groups

College of Water Conservancy and Hydropower Engineering, Hohai University, Nanjing, China
School of Earth Sciences and Engineering, Hohai University, Nanjing, China
State Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing, China
Centre of Excellence in Water Resources Engineering, University of Engineering & Technology, Lahore, Pakistan

Short Summary

This study compares six models (five machine learning models and the physically based SWAT model) for monthly runoff prediction in the glacier-fed Hunza River Basin, Pakistan, over 2007–2022. XGBoost was the most accurate model under climatic variability and was statistically superior to SWAT, although all models struggled with extreme runoff events.

Objective

To evaluate and compare the performance of five machine learning models (XGBoost, LSTM, ANN, SVR, KNN) and the physically-based Soil and Water Assessment Tool (SWAT) model for monthly runoff prediction in the glacier-fed Hunza River Basin.
To integrate multiple climatic factors, including precipitation, temperature, relative humidity, wind speed, and snow cover, as inputs for runoff prediction.
To test the hypothesis that machine learning models outperform the SWAT model in runoff prediction accuracy in this climate-sensitive region.

Study Configuration

Spatial Scale: Hunza River Basin (HRB), located in the Gilgit-Baltistan region of the Western Karakoram Mountains, Pakistan. The basin covers an area of approximately 13,567.23 square kilometers, with elevations ranging from 1,370 meters to 7,885 meters.
Temporal Scale: The study period for data collection and model evaluation was from 2007 to 2022, with a monthly time step. SWAT model phases included warm-up (2007–2008), calibration (2009–2015), and validation (2016–2022). Machine learning models were trained from 2007 to 2015 and tested from 2016 to 2022.
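
The split described above is chronological rather than random. The following is a minimal sketch of how the stated periods partition the 2007–2022 monthly record; the helper names (monthly_index, in_period) are illustrative and not taken from the paper.

```python
from datetime import date

# Inclusive year ranges as stated in the summary.
ML_TRAIN = (2007, 2015)
ML_TEST = (2016, 2022)
SWAT_PHASES = {
    "warm-up": (2007, 2008),
    "calibration": (2009, 2015),
    "validation": (2016, 2022),
}

def monthly_index(start_year, end_year):
    """Return the first day of every month in the inclusive year range."""
    return [date(y, m, 1)
            for y in range(start_year, end_year + 1)
            for m in range(1, 13)]

def in_period(d, period):
    """True if date d falls inside the inclusive (start_year, end_year) period."""
    return period[0] <= d.year <= period[1]

months = monthly_index(2007, 2022)
train = [d for d in months if in_period(d, ML_TRAIN)]
test = [d for d in months if in_period(d, ML_TEST)]
swat_counts = {name: sum(in_period(d, p) for d in months)
               for name, p in SWAT_PHASES.items()}

print(len(months), len(train), len(test))  # 192 108 84
print(swat_counts)
```

Under these periods the machine learning test window and the SWAT validation window cover the same 84 months, so the reported test and validation scores refer to the same period.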

Methodology and Data

Models used:
- Soil and Water Assessment Tool (SWAT)
- Artificial Neural Network (ANN)
- Support Vector Regression (SVR)
- K-Nearest Neighbors (KNN)
- Long Short-Term Memory (LSTM)
- Extreme Gradient Boosting (XGBoost)

Data sources:
- Climatic data: Monthly precipitation, minimum and maximum temperature, wind speed, and relative humidity from the Pakistan Meteorological Department (PMD) for 2007–2022.
- Runoff data: Monthly observed runoff data from the Water and Power Development Authority (WAPDA) for 2007–2022.
- Spatial data: Digital Elevation Model (DEM) and Land Use and Land Cover (LULC) data from Sentinel-2 imagery (10 meters spatial resolution), and a Soil map from FAO (1:500,000 scale).
- Snow cover data: Monthly aggregated snow cover data from the Moderate Resolution Imaging Spectroradiometer (MODIS) Terra (MOD10A1) product via Google Earth Engine (GEE).
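
MOD10A1 is a daily product, so the snow cover series has to be aggregated to match the monthly time step of the other inputs. The summary does not say which aggregation was used. The sketch below assumes a simple monthly mean that skips missing days, and the input values are hypothetical, not real data.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def monthly_mean(daily):
    """Average daily values per (year, month), ignoring missing (None) days."""
    buckets = defaultdict(list)
    for day, value in daily:
        if value is not None:  # e.g. a cloud-obscured observation
            buckets[(day.year, day.month)].append(value)
    return {key: mean(vals) for key, vals in sorted(buckets.items())}

# Hypothetical basin-average snow cover (percent); illustrative values only.
daily_snow = [
    (date(2007, 1, 1), 80.0),
    (date(2007, 1, 2), None),
    (date(2007, 1, 3), 84.0),
    (date(2007, 2, 1), 70.0),
    (date(2007, 2, 2), 74.0),
]
monthly_snow = monthly_mean(daily_snow)
print(monthly_snow)  # {(2007, 1): 82.0, (2007, 2): 72.0}
```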

Main Results

XGBoost demonstrated the highest predictive accuracy and stability among all models, achieving an overall Nash-Sutcliffe Efficiency (NSE) of 0.897, a Root Mean Square Error (RMSE) of 52.613 cubic meters per second (m³/s), and a Mean Absolute Percentage Error (MAPE) of 20.195%. Its Percent Bias (PBIAS) of -7.923% indicated a minor underestimation during high-flow periods.
LSTM performed second best with an NSE of 0.862, followed by ANN (NSE = 0.859), KNN (NSE = 0.843), and SVR (NSE = 0.741 during testing).
The physically-based SWAT model showed the lowest predictive performance for short-term runoff, with an NSE of 0.811 during calibration and a decline to 0.696 during validation. It had the highest RMSE (89.440 m³/s) and MAPE (40.574%), and a PBIAS of 10.04% during validation, indicating overestimation.
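
For reference, the four metrics quoted above can be computed as in the following sketch, using their standard definitions. The PBIAS sign convention is an assumption chosen to match the interpretation in this summary (negative means underestimation, positive means overestimation); the paper's exact formula is not given here, and some references use the opposite sign. The sample values are hypothetical.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def rmse(obs, sim):
    """Root Mean Square Error, in the units of the data (here m³/s)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mape(obs, sim):
    """Mean Absolute Percentage Error, in percent; requires non-zero observations."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)

def pbias(obs, sim):
    """Percent Bias, in percent. Assumed convention: positive = overestimation."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)

# Hypothetical monthly runoff values (m³/s); not taken from the paper.
obs = [100.0, 200.0, 400.0, 300.0]
sim = [110.0, 190.0, 360.0, 310.0]

print(f"NSE={nse(obs, sim):.3f} RMSE={rmse(obs, sim):.3f} "
      f"MAPE={mape(obs, sim):.3f}% PBIAS={pbias(obs, sim):.3f}%")
```

In this toy series the simulated peak is too low, which yields a negative PBIAS under the assumed convention.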
All models, including XGBoost, struggled to accurately predict extreme runoff events, particularly underestimating peak flows observed in 2020, 2021, and 2022. For instance, the observed peak runoff of 726.57 m³/s in July 2020 was underestimated by XGBoost by approximately 322.39 m³/s.
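
The size of this miss is easier to judge in relative terms. The short calculation below uses only the two figures quoted above; the implied prediction and the percentage are derived here and are not reported in the summary.

```python
observed_peak = 726.57  # m³/s, July 2020 (from the summary)
shortfall = 322.39      # m³/s, XGBoost underestimation (from the summary)

# Derived quantities, computed from the two figures above.
implied_prediction = observed_peak - shortfall
relative_shortfall = 100.0 * shortfall / observed_peak

print(f"implied XGBoost prediction: {implied_prediction:.2f} m³/s")  # 404.18
print(f"relative shortfall: {relative_shortfall:.1f}%")              # 44.4
```

In other words, the best-performing model captured only a little over half of the observed July 2020 peak.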
Statistical analysis (ANOVA and Tukey’s HSD tests) confirmed statistically significant differences in mean residuals and error variability across models, with XGBoost exhibiting lower error variability and statistically superior performance compared to SWAT.
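
As an illustration of the first of these tests, the sketch below computes the one-way ANOVA F statistic for residuals grouped by model. The residuals and model labels are hypothetical, and the paper's actual test setup may differ.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical residuals (observed minus predicted, m³/s); illustrative only.
residuals = {
    "model_a": [1.0, 2.0, 3.0],
    "model_b": [2.0, 3.0, 4.0],
    "model_c": [5.0, 6.0, 7.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(residuals.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F(2, 6) = 13.00
```

A large F only indicates that at least one group mean differs. A post hoc test such as Tukey's HSD is then used to identify which pairs of models differ. Where SciPy is available, scipy.stats.f_oneway and scipy.stats.tukey_hsd provide both steps, including p-values.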
XGBoost was also more computationally efficient, requiring substantially less training time and memory than deep learning models such as LSTM and ANN.

Contributions

This study provides a comprehensive comparative evaluation of diverse modeling approaches (physically-based, classical machine learning, deep learning, and ensemble machine learning) for runoff prediction in a complex, glacier-fed basin.
It highlights the importance and effectiveness of integrating multiple climatic factors, including snow cover, relative humidity, and wind speed, as inputs for runoff prediction in such regions.
The research identifies XGBoost as the most robust and accurate model for monthly runoff prediction in the Hunza River Basin under climatic variability, offering a valuable tool for water resource management.
It critically assesses the limitations of both machine learning and physically-based models in accurately capturing and predicting extreme runoff events in glacier-fed environments, pointing to areas for future research and model refinement.
The findings can inform water resource management, flood control, and strategic planning in regions strongly affected by climate change.

Funding

This study was conducted as part of Hohai University academic requirements with no external funding.

Citation

@article{Kareem2025Runoff,
  author = {Kareem, Muhammad Ghawas and Tang, De-shan and Farhan, Muhammad and Khalil, Anis ur Rehman and Abid, Hafiz Ahmad Hammad},
  title = {Runoff prediction under climatic variability using SWAT and machine learning models: a case study of the Hunza River basin},
  journal = {Theoretical and Applied Climatology},
  year = {2025},
  doi = {10.1007/s00704-025-05944-8},
  url = {https://doi.org/10.1007/s00704-025-05944-8}
}

Original Source: https://doi.org/10.1007/s00704-025-05944-8