Najafzadeh et al. (2026) Assessment of flood susceptibility in Minab County, Iran, through the integration of topographic, climatic, and land-surface indices using ensemble machine learning models
Identification
- Journal: Journal of Hydrology: Regional Studies
- Year: 2026
- Date: 2026-03-13
- Authors: Mohammad Najafzadeh, Mohadeseh Shahsavari
- DOI: 10.1016/j.ejrh.2026.103327
Research Groups
Department of Water Engineering, Faculty of Civil and Surveying Engineering, Graduate University of Advanced Technology, Kerman, Iran
Short Summary
This study developed a high-resolution flood susceptibility map for Minab County, Iran, by integrating multi-source geospatial datasets with seven machine learning models. It found that ensemble tree-based models (CatBoost and Random Forest) provided the most balanced and generalizable performance, identifying short-term precipitation and surface moisture as dominant flood drivers, with approximately 53% of the study area classified as high to very high flood risk.
Objective
- To develop an event-informed flood susceptibility framework for Minab County, southern Iran, by integrating multi-source geospatial datasets (topographic, climatic, soil, and spectral indices) within a comparative ensemble machine learning environment.
- To systematically evaluate seven predictive machine learning models (XGBoost, AdaBoost, CatBoost, Random Forest, Model Tree, K-Nearest Neighbors, and Support Vector Machine) to examine the trade-off between predictive accuracy and generalization stability.
- To employ Shapley Additive Explanations (SHAP) analysis to identify dominant flood drivers and ensure physical interpretability of the results.
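To make the SHAP objective concrete, the sketch below computes exact Shapley values for a single sample by enumerating every feature coalition. It is a toy illustration of the underlying attribution idea, not the authors' pipeline or the `shap` library. The model `toy_model`, its coefficients, and the sample and baseline values are all hypothetical; only the feature abbreviations (P1DBF, LSWI, Sl) come from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to `baseline`.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_f = dict(without)
                with_f[name] = x[name]
                # Marginal contribution of `name` to this coalition
                total += weight * (predict(with_f) - predict(without))
        phi[name] = total
    return phi


def toy_model(f):
    """Hypothetical susceptibility score with one interaction term."""
    return 0.5 * f["P1DBF"] + 0.3 * f["LSWI"] + 0.2 * f["P1DBF"] * f["LSWI"] - 0.1 * f["Sl"]


# Hypothetical, already-scaled feature values for one location
x = {"P1DBF": 1.0, "LSWI": 1.0, "Sl": 1.0}
baseline = {"P1DBF": 0.0, "LSWI": 0.0, "Sl": 0.0}

phi = shapley_values(toy_model, x, baseline)
gap = toy_model(x) - toy_model(baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
print(f"sum of attributions: {sum(phi.values()):.3f}, prediction gap: {gap:.3f}")
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved. Ranking features by mean absolute attribution over many samples is the usual way such values are turned into a list of dominant drivers.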
Study Configuration
- Spatial Scale: Minab County, Hormozgan Province, southern Iran, covering the rural districts of Humeh (280.271 km²), Bandzarak (199.41 km²), and Tiab (293.893 km²), totaling 773.574 km². Geographical coordinates: 56°48′ to 57°12′ E longitude and 27°00′ to 27°16′ N latitude. All data were resampled to a 30 m spatial resolution.
- Temporal Scale: The study focused on the January 2022 flood event. Precipitation data (MMP, MXMP) covered January of each year from 1997 to 2021, while P1DBF and P2DBF covered January 1 and 2, 2022. SRTM DEM data were acquired in February 2000. Landsat-8 imagery with less than 5% cloud cover was used.
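As a quick check on the spatial configuration, the following sketch sums the three district areas and estimates how many 30 m grid cells that area corresponds to. The variable names are illustrative, and the cell count assumes square 30 m cells that tile the stated area exactly, which ignores boundary effects and any no-data pixels.

```python
# District areas in km², as stated above
district_areas_km2 = {"Humeh": 280.271, "Bandzarak": 199.41, "Tiab": 293.893}

total_km2 = sum(district_areas_km2.values())

CELL_SIZE_M = 30                      # common analysis resolution
cell_area_m2 = CELL_SIZE_M ** 2       # 900 m² per cell
n_cells = total_km2 * 1_000_000 / cell_area_m2

# Share of the study area contributed by each district
shares = {name: area / total_km2 for name, area in district_areas_km2.items()}

print(f"total area: {total_km2:.3f} km²")
print(f"approximate number of 30 m cells: {n_cells:,.0f}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The district areas add up to the stated total of 773.574 km², which corresponds to roughly 860,000 cells at 30 m resolution.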
Methodology and Data
- Models used: Extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), Categorical Boosting (CatBoost), Random Forest (RF), Model Tree (MT), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM).
- Data sources:
  - Satellite/Remote Sensing:
    - Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) (30 m spatial resolution) for topographic indices: Elevation (El), Slope (Sl), Slope Aspect (SA), River Distance (RD), Waterway and River Density (WRD), Curvature (Cu), and Topographic Wetness Index (TWI).
    - Landsat-8 satellite imagery (30 m spatial resolution) for spectral and land-surface indices: Normalized Difference Vegetation Index (NDVI), Normalized Difference Water Index (NDWI), Land Surface Water Index (LSWI), and Land Use/Land Cover (LULC).
  - Observation/Reanalysis:
    - Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) (0.05° (~5 km) native spatial resolution, resampled to 30 m) for climatic indices: Mean of Monthly Precipitation (MMP), Maximum Monthly Precipitation (MXMP), Precipitation two days before the Flood (P2DBF), and Precipitation one day before the Flood (P1DBF).
    - USDA global soil database for soil texture (ST) data.
    - 585 flood inventory points from the January 2022 flood event, collected via field surveys, eyewitness accounts, and official reports, used for training (75%) and testing (25%) the models. These points were classified into five flood risk categories based on observed inundation depth.
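The spectral and topographic indices listed above follow standard formulas, sketched here for a single pixel. Several details are assumptions, since the summary does not give the exact definitions used in the paper: NDWI is written in the green/NIR (McFeeters) form, LSWI uses the first shortwave-infrared band, TWI is taken as ln(a / tan β) with a small slope floor to avoid division by zero on flat cells, and all input values are hypothetical.

```python
import math


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else float("nan")


def ndvi(nir, red):
    # Landsat-8 OLI: NIR = band 5, Red = band 4
    return normalized_difference(nir, red)


def ndwi(green, nir):
    # Assumed McFeeters form: Green = band 3, NIR = band 5
    return normalized_difference(green, nir)


def lswi(nir, swir1):
    # NIR = band 5, SWIR1 = band 6
    return normalized_difference(nir, swir1)


def twi(specific_catchment_area_m, slope_deg, min_slope_deg=0.1):
    """Topographic Wetness Index, ln(a / tan(beta)).

    `min_slope_deg` is an illustrative floor for flat cells, not a value
    taken from the study.
    """
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area_m / math.tan(beta))


# Hypothetical surface reflectance for one vegetated pixel
pixel = {"green": 0.08, "red": 0.06, "nir": 0.30, "swir1": 0.20}

print(f"NDVI = {ndvi(pixel['nir'], pixel['red']):.3f}")
print(f"NDWI = {ndwi(pixel['green'], pixel['nir']):.3f}")
print(f"LSWI = {lswi(pixel['nir'], pixel['swir1']):.3f}")
print(f"TWI  = {twi(specific_catchment_area_m=900.0, slope_deg=45.0):.3f}")
```

In practice the same arithmetic would be applied to whole raster bands rather than single pixels. Higher LSWI and NDWI indicate wetter surfaces, which is consistent with their role as moisture-related predictors in this study.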
Main Results
- Ensemble tree-based models generally outperformed non-ensemble approaches. CatBoost and Random Forest demonstrated the most balanced and generalizable performance, with Area Under the Curve (AUC) values of 0.991 and 0.970, respectively.
- XGBoost and AdaBoost models achieved near-perfect statistical metrics (AUC = 0.9999, RMSE < 0.002, MAPE < 0.25%) in both training and testing, suggesting potential overfitting.
- SHAP analysis identified short-term precipitation prior to the flood event (P1DBF, P2DBF) and surface moisture conditions (LSWI, NDWI) as the dominant drivers of flood susceptibility. Elevation, slope gradient, and distance to river were secondary controls.
- The final flood susceptibility map indicated that approximately 53% of the study area falls within high to very high flood-risk classes, concentrated along the Minab River and surrounding agricultural plains.
- Entropy-based uncertainty analysis confirmed that boosting algorithms (XGBoost, AdaBoost) exhibited the lowest uncertainty, while CatBoost and Random Forest showed moderate and stable uncertainty, and SVM consistently had the highest uncertainty.
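The entropy-based uncertainty comparison can be illustrated with Shannon entropy over predicted class probabilities. This is a minimal sketch assuming the uncertainty measure is the normalized Shannon entropy of a model's probability vector across the five flood-risk categories; the summary does not state the exact formulation used, and the probability vectors below are hypothetical.

```python
import math


def shannon_entropy(probs, normalize=True):
    """Shannon entropy of a discrete probability vector.

    With `normalize=True` the result is divided by ln(K), so 0 means a fully
    confident prediction and 1 means a uniform one over K classes.
    """
    h = sum(-p * math.log(p) for p in probs if p > 0)
    if normalize and len(probs) > 1:
        h /= math.log(len(probs))
    return h


# Hypothetical class-probability outputs for one location (five risk classes)
predictions = {
    "very confident model": [0.96, 0.01, 0.01, 0.01, 0.01],
    "moderately confident model": [0.70, 0.15, 0.08, 0.05, 0.02],
    "uncertain model": [0.20, 0.20, 0.20, 0.20, 0.20],
}

for label, probs in predictions.items():
    print(f"{label}: normalized entropy = {shannon_entropy(probs):.3f}")
```

Low entropy only indicates that a model is confident, not that it is correct, so the very low uncertainty of XGBoost and AdaBoost is best read alongside their near-perfect metrics rather than as independent evidence of reliability.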
Contributions
- Integration of hydrological, topographic, climatic, soil, and spectral predictors within a unified modeling framework.
- Comparative evaluation of multiple ensemble and non-ensemble machine learning models under a consistent validation structure.
- Incorporation of SHAP analysis to enhance transparency and interpretability of flood drivers.
- Development of a spatially coherent flood susceptibility map supporting risk-informed planning in an arid basin context.
Funding
No funds, grants, or other support was received.
Citation
@article{Najafzadeh2026Assessment,
author = {Najafzadeh, Mohammad and Shahsavari, Mohadeseh},
title = {Assessment of flood susceptibility in Minab County, Iran, through the integration of topographic, climatic, and land-surface indices using ensemble machine learning models},
journal = {Journal of Hydrology: Regional Studies},
year = {2026},
doi = {10.1016/j.ejrh.2026.103327},
url = {https://doi.org/10.1016/j.ejrh.2026.103327}
}
Original Source: https://doi.org/10.1016/j.ejrh.2026.103327