Khan et al. (2025) Climate-driven flood hazard assessment in data-scarce mountainous basins using a GIS-based machine learning and hydrodynamic modelling under CMIP6 SSP scenarios
Identification
- Journal: Scientific Reports
- Year: 2025
- Date: 2025-12-14
- Authors: Shahbaz Khan, Afed Ullah Khan, Abdullah Alodah, Ahmad Azeem, Muhammad Waqas, Faten Nahas, Nazih Y. Rebouh, Youssef M. Youssef
- DOI: 10.1038/s41598-025-31390-7
Research Groups
- Department of Civil Engineering, University of Engineering and Technology Peshawar, Bannu Campus, Pakistan
- National Institute of Urban Infrastructure Planning, University of Engineering and Technology Peshawar, Pakistan
- Department of Civil Engineering, College of Engineering, Qassim University, Buraidah, Saudi Arabia
- Department of Geography, College of Humanities and Social Sciences, King Saud University, Riyadh, Saudi Arabia
- Institute of Environmental Engineering, RUDN University, Moscow, Russia
- Geological and Geophysical Engineering Department, Faculty of Petroleum and Mining Engineering, Suez University, Suez, Egypt
Short Summary
This study developed a hybrid framework combining explainable AI (SHAP-XGBoost, Random Forest) and coupled hydrologic-hydraulic modeling (HEC-HMS–HEC-RAS) to assess climate-driven flood hazards in data-scarce mountainous basins under CMIP6 SSP scenarios. It found a substantial increase in flood hazard under future scenarios, particularly SSP585, where high and very high hazard zones expanded significantly.
Objective
- To develop a novel hybrid framework integrating explainable AI, ensemble climate modeling, and coupled hydrologic-hydraulic simulations to assess current and future climate-driven flood hazards in data-scarce mountainous basins (specifically the Swat River Basin) under CMIP6 SSP245 and SSP585 scenarios.
Study Configuration
- Spatial Scale: Swat River Basin (SRB), approximately 5337 square kilometers. Digital Elevation Models (DEMs) at 30-meter and 10-meter resolution were used, with computational meshes refined to 10 meters.
- Temporal Scale:
  - Observed climate and streamflow data: 1993–2020.
  - Historical Global Climate Model (GCM) period: 1993–2014.
  - Hydrological model calibration period: 1993–2013.
  - Hydrological model validation period: 2014–2019.
  - Future climate projections (SSP245 and SSP585): 2015–2099.
  - Hydraulic simulation duration for synthetic hydrographs: 100 hours.
Methodology and Data
- Models used:
  - Bias Correction: Linear Scaling method.
  - GCM Ranking: XGBoost regression with SHapley Additive exPlanations (SHAP) interpretation.
  - Multi-Model Ensemble (MME): Random Forest (RF) regression.
  - Hydrological Modeling: HEC-HMS v4.12 (using the SCS Curve Number method for infiltration, the SCS Unit Hydrograph method for runoff transformation, and Monthly Constant Flow for baseflow).
  - Flood Frequency Analysis: EasyFit software (fitting Log-Logistic, Gumbel, and Generalized Extreme Value (GEV) distributions) combined with the VIKOR multi-criteria decision-making model.
  - Hydraulic Modeling: HEC-RAS 2D v6.0.
  - Hazard Assessment: GIS-based composite index (H = D × V, where D is flood depth in meters and V is flow velocity in meters per second).
- Data sources:
  - Observed Climate Data (daily precipitation, maximum and minimum temperature): Pakistan Meteorological Department (PMD), Kalam station (1993–2020).
  - Observed Streamflow Data (daily discharge): Water and Power Development Authority (WAPDA), Chakdara station (1993–2020).
  - Global Climate Models (precipitation, maximum and minimum temperature): 11 CMIP6 GCMs (NESM3, CMCC-ESM2, CNRM-CM6-1, CNRM-ESM2-1, EC-Earth3-Veg-LR, GFDL-ESM4, INM-CM4-8, INM-CM5-0, MIROC6, MRI-ESM2-0, NorESM2-MM) from the CMIP6 Repository (historical 1993–2014; SSP245 and SSP585 2015–2099).
  - Digital Elevation Model (DEM): NASA Earth Data Portal (30-meter resolution) and Google Earth Engine (GEE) (10-meter resolution).
  - Land Use / Land Cover: Processed in GIS, with reference to the curve number (CN) raster from hydrologydata.org.
  - Soil Data (Hydrological Groups): https://hydrologydata.org.
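The composite hazard index listed under the models above, H = D × V, can be illustrated with a minimal Python sketch. Only the formula comes from the summary. The class breakpoints, the function names, and the assumption that every grid cell has equal area are hypothetical and chosen purely for illustration; the thresholds the authors actually applied are not stated here.

```python
from typing import List, Sequence, Tuple

# Hypothetical upper bounds (m^2/s) for each hazard class. These are NOT
# taken from the paper; they only make the example concrete.
HAZARD_BREAKS: List[Tuple[float, str]] = [
    (0.5, "low"),
    (1.0, "moderate"),
    (2.0, "high"),
]
TOP_CLASS = "very high"


def hazard_index(depth_m: float, velocity_ms: float) -> float:
    """Composite hazard H = D x V for a single grid cell."""
    return depth_m * velocity_ms


def classify(h: float) -> str:
    """Map a hazard value to a class using the assumed breakpoints."""
    for upper, label in HAZARD_BREAKS:
        if h < upper:
            return label
    return TOP_CLASS


def high_hazard_fraction(depths: Sequence[float],
                         velocities: Sequence[float]) -> float:
    """Share of wet cells (depth > 0) classed as high or very high.

    Assumes equal-area cells, so a cell count stands in for area.
    """
    wet = [(d, v) for d, v in zip(depths, velocities) if d > 0]
    if not wet:
        return 0.0
    high = sum(
        1 for d, v in wet
        if classify(hazard_index(d, v)) in ("high", TOP_CLASS)
    )
    return high / len(wet)


# Toy depth (m) and velocity (m/s) values for five cells; the first is dry.
depths = [0.0, 0.4, 1.2, 2.5, 3.0]
velocities = [0.0, 0.5, 1.0, 1.5, 0.2]
fraction = high_hazard_fraction(depths, velocities)
```

In a real workflow the depth and velocity values would come from the maximum-depth and maximum-velocity rasters exported from the 2D hydraulic model, and the fraction would be computed per scenario and return period.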
Main Results
- The linear scaling method effectively reduced systematic biases in GCM outputs for daily precipitation and for maximum and minimum temperature.
- XGBoost regression with SHAP interpretation achieved high predictive accuracy for GCM ranking (R²: 0.934/0.923 for precipitation, 0.956/0.948 for maximum temperature, 0.947/0.939 for minimum temperature).
- The Random Forest Multi-Model Ensemble (MME) further improved climate projection performance (R²: 0.742/0.725 for precipitation; 0.971/0.965 for maximum temperature; 0.963/0.957 for minimum temperature).
- The HEC-HMS hydrological model was calibrated (1993–2013) and validated (2014–2019) with satisfactory results (Nash–Sutcliffe Efficiency (NSE): 0.612/0.603; Percent Bias (PBIAS): +3.96%/–6.75%).
- Flood frequency analysis identified Log-Logistic (historical), Gumbel (SSP245), and GEV (SSP585) as the most appropriate probability distributions for different scenarios.
- Under the SSP585 scenario, high to very high flood hazard zones expanded to 78% of the floodplain for the 100-year return period event, compared to 69% under historical conditions for the same return period.
- Future projections indicate increased peak discharges and higher flood risk, particularly under the high-emission SSP585 scenario, with significant increases in flood depth and velocity.
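The calibration and validation statistics reported above for HEC-HMS (NSE and PBIAS) follow standard definitions, sketched below in plain Python. The function names and the toy discharge values are illustrative assumptions. The PBIAS sign convention used here (positive values mean the simulation underestimates observed flow) is one common choice; the summary does not state which convention the authors applied.

```python
from typing import Sequence


def _check(observed: Sequence[float], simulated: Sequence[float]) -> None:
    """Reject empty or mismatched series before computing a metric."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations.

    A value of 1 is a perfect fit; 0 means the model is no better than
    the observed mean.
    """
    _check(observed, simulated)
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / spread


def pbias(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Percent bias as 100 * sum(obs - sim) / sum(obs).

    Assumed convention: positive = underestimation by the model.
    """
    _check(observed, simulated)
    residual = sum(o - s for o, s in zip(observed, simulated))
    return 100.0 * residual / sum(observed)


# Toy daily discharges in m^3/s, not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 35.0]

nse_value = nse(observed, simulated)      # about 0.916
pbias_value = pbias(observed, simulated)  # 2.0
```

Applied separately to the 1993–2013 and 2014–2019 series, the same two functions would yield one calibration value and one validation value per metric, matching the paired figures in the results list.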
Contributions
- This study is the first to unify SHAP-informed GCM selection, machine learning-based ensemble modeling, statistical bias correction, frequency-based flood estimation, and 2D physically-based hydraulic simulation into a single interpretable framework.
- The framework is specifically tailored for the Swat River Basin, a snow-fed, topographically complex, and data-limited watershed.
- It provides an integrated, explainable, and scalable workflow for transparent flood hazard assessment in complex, data-scarce mountain basins under climate change.
- The framework offers a practical decision-support system for climate-resilient spatial planning and is adaptable to other data-limited or topographically complex basins.
Funding
- Financial support was provided by the Deanship of Graduate Studies and Scientific Research at Qassim University (reference code: QU-APC-2025).
Citation
@article{Khan2025Climatedriven,
author = {Khan, Shahbaz and Khan, Afed Ullah and Alodah, Abdullah and Azeem, Ahmad and Waqas, Muhammad and Nahas, Faten and Rebouh, Nazih Y. and Youssef, Youssef M.},
title = {Climate-driven flood hazard assessment in data-scarce mountainous basins using a GIS-based machine learning and hydrodynamic modelling under CMIP6 SSP scenarios},
journal = {Scientific Reports},
year = {2025},
doi = {10.1038/s41598-025-31390-7},
url = {https://doi.org/10.1038/s41598-025-31390-7}
}
Original Source: https://doi.org/10.1038/s41598-025-31390-7