Haghizadeh et al. (2026) Preparation of flood potential maps using machine learning and comparison of their performance
Identification
- Journal: Natural Hazards
- Year: 2026
- Date: 2026-01-01
- Authors: Ali Haghizadeh, Tayebeh Sepahvand, Leila Ghasemi, Babak Shahinejad
- DOI: 10.1007/s11069-025-07777-0
Research Groups
- Department of Watershed Management Engineering, Faculty of Natural Resources, Lorestan University, Khorramabad, Iran
- Department of Water Engineering, Faculty of Agriculture, Lorestan University, Khorram Abad, Lorestan, Iran
Short Summary
This study developed flood potential maps for the Borujerd-Dorud basin, Iran, by evaluating and comparing six machine learning models (Deep Learning, CatBoost, XGBoost, Random Forest, K-Nearest Neighbors, Support Vector Machine). The Random Forest model achieved the highest AUC (0.71), and distance from rivers was identified as the most influential factor for flood vulnerability.
Objective
- To create a flood potential map for the Borujerd-Dorud basin of Lorestan province, Iran.
- To evaluate and compare the performance of six machine learning models (Deep Learning, CatBoost, XGBoost, Random Forest, K-Nearest Neighbors, Support Vector Machine) for flood susceptibility mapping.
- To identify the most accurate model for delineating flood-prone areas and determine the most influential factors.
Study Configuration
- Spatial Scale: Borujerd-Dorud basin (Tireh watershed) in Lorestan province, Iran, covering 2127 square kilometers. Input layers were prepared in raster format with a pixel size of 12.5 meters. Spatial group k-fold cross-validation used 1 kilometer by 1 kilometer grid cells.
- Temporal Scale: Flood-prone locations identified in 2024. Land Use/Land Cover data derived from Sentinel-1 and Sentinel-2 imagery from 2022. Precipitation data based on a 24-hour annual rainfall with a 100-year return period.
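The spatial blocking described above can be sketched as follows. This is a minimal illustration, assuming point coordinates in a projected system measured in meters; the function names, the greedy balancing rule, and the sample coordinates are hypothetical and are not taken from the paper. scikit-learn's `GroupKFold` offers a comparable grouped split when the block identifiers are passed as groups.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 1000.0  # 1 km x 1 km blocks, as stated in the study configuration
N_FOLDS = 5           # five folds, as stated in the validation description


def block_id(x_m, y_m, cell=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(x_m / cell), math.floor(y_m / cell))


def spatial_group_folds(points, n_folds=N_FOLDS):
    """Assign each point to a fold so that no grid cell is split across folds.

    Blocks are handed out largest-first to the currently smallest fold,
    which keeps fold sizes roughly balanced (an assumed heuristic).
    """
    members = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        members[block_id(x, y)].append(idx)

    fold_sizes = [0] * n_folds
    fold_of_point = [None] * len(points)
    # Sort by block size (descending), then by block id, for a deterministic result.
    ordered = sorted(members.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for _block, idxs in ordered:
        target = fold_sizes.index(min(fold_sizes))
        for i in idxs:
            fold_of_point[i] = target
        fold_sizes[target] += len(idxs)
    return fold_of_point


# Hypothetical projected coordinates in meters (not data from the paper).
points = [(120.0, 80.0), (900.0, 950.0), (1500.0, 200.0), (1999.0, 999.0),
          (2500.0, 2500.0), (3100.0, 400.0), (4200.0, 4800.0), (5050.0, 50.0)]
folds = spatial_group_folds(points)

for held_out in range(N_FOLDS):
    test_idx = [i for i, f in enumerate(folds) if f == held_out]
    train_idx = [i for i, f in enumerate(folds) if f != held_out]
    print(f"fold {held_out}: train={train_idx} test={test_idx}")
```

Because whole blocks are held out together, points that share a 1 km cell never appear on both sides of a train/test split, which is the property that makes the validation spatially independent.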
Methodology and Data
- Models used: Deep Learning (DNN), Extreme Gradient Boosting (XGBoost), Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), CatBoost.
- Data sources:
  - Flood-prone locations: 270 points from the Regional Water Company of Lorestan.
  - Input layers (raster, 12.5-meter pixel size):
    - Land Use/Land Cover (LU/LC): Derived from Sentinel-2 optical imagery and Sentinel-1 radar data (2022) using Google Earth Engine.
    - Geology: Lithological units (metamorphic rocks, Cretaceous limestone, Garin Formation limestone, Quaternary sediments).
    - Slope: Derived from a 12.5-meter ALOS Digital Elevation Model (DEM), categorized into five groups (0–9%, 9–44%, 44–72%, 72–138%, >138%).
    - Aspect: Derived from the 12.5-meter ALOS DEM.
    - Precipitation: 24-hour annual rainfall with a 100-year return period, computed based on fractal theory.
    - Soil Hydrological Group: Classified into A, B, and D categories following the USDA Soil Conservation Service (SCS) system (Group A: > 0.0076 meters per hour infiltration; Group B: 0.0038–0.0076 meters per hour infiltration; Group D: < 0.0038 meters per hour infiltration).
    - Infiltration: Prepared from the soil hydrological group and divided into three classes: less than 0.003, between 0.01 and 0.02, and greater than 0.02 (units not specified in the source).
    - Distance from rivers: Derived from the DEM and categorized using distance thresholds of 50, 100, and 200 meters.
- Validation: Spatial group k-fold cross-validation (5 folds) using 1 kilometer by 1 kilometer grid cells to ensure spatial independence.
- Evaluation metrics: Area Under the Curve (AUC), Accuracy, F1-score, Recall.
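The four evaluation metrics listed above can be computed from binary labels and predicted scores as in the sketch below. It assumes a 0.5 decision threshold and uses the pairwise (rank-based) definition of AUC; the threshold, the function names, and the sample labels and scores are illustrative assumptions, not values from the paper.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Count true/false positives and negatives at a given score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp, fn):
    return tp / (tp + fn)


def f1_score(tp, fp, fn):
    # Equivalent to the harmonic mean of precision and recall.
    return 2 * tp / (2 * tp + fp + fn)


def auc(y_true, y_score):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = flood-prone) and model scores, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.55]

tp, fp, tn, fn = confusion_counts(y_true, y_score)
print("AUC      =", auc(y_true, y_score))
print("Accuracy =", accuracy(tp, fp, tn, fn))
print("Recall   =", recall(tp, fn))
print("F1-score =", round(f1_score(tp, fp, fn), 4))
```

AUC is threshold-independent, while Accuracy, Recall, and F1-score all depend on the chosen cutoff, so a model can rank first on AUC without ranking first on the other three.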
Main Results
- All six machine learning models were capable of generating flood potential maps for the Borujerd-Dorud basin.
- The Random Forest (RF) model achieved the highest Area Under the Curve (AUC = 0.71), although XGBoost had marginally higher Accuracy (0.67) and F1-score (0.68).
- The K-Nearest Neighbors (KNN) model is reported as the least accurate (AUC = 0.65), although the per-model metrics below give the Deep Learning (DNN) model a lower AUC (0.62) and Accuracy (0.61).
- Distance from rivers was the most influential factor for flood vulnerability across all models, followed by geology.
- The central section of the Borujerd-Dorud basin exhibited the highest flood susceptibility potential due to its proximity to the main river, lower slope, and lower elevation.
- Model performance metrics:
  - Random Forest (RF): AUC = 0.71, Recall = 0.67, Accuracy = 0.66, F1-score = 0.67
  - XGBoost (XGB): AUC = 0.69, Recall = 0.68, Accuracy = 0.67, F1-score = 0.68
  - CatBoost: AUC = 0.69, Recall = 0.72, Accuracy = 0.66, F1-score = 0.68
  - Deep Learning (DNN): AUC = 0.62, Recall = 0.79, Accuracy = 0.61, F1-score = 0.67
  - Support Vector Machine (SVM): AUC = 0.68, Recall = 0.66, Accuracy = 0.65, F1-score = 0.66
  - K-Nearest Neighbors (KNN): AUC = 0.65, Recall = 0.65, Accuracy = 0.63, F1-score = 0.64
- Bootstrap-based uncertainty analysis revealed high-confidence zones in urban cores and well-documented floodplains, transitional zones in agricultural areas, and high-uncertainty zones in peri-urban fringes and topographic depressions.
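The reported per-model metrics can be compared programmatically, as in the sketch below. It ranks the six models by AUC and backs out the precision implied by each reported F1-score and Recall pair, using the identity F1 = 2PR / (P + R). The metric values are those listed above; the dictionary layout and function name are illustrative, and the implied precision is only approximate because the reported figures are rounded to two decimals.

```python
# Metrics as reported in the summary above.
reported = {
    "RF":       {"auc": 0.71, "recall": 0.67, "accuracy": 0.66, "f1": 0.67},
    "XGB":      {"auc": 0.69, "recall": 0.68, "accuracy": 0.67, "f1": 0.68},
    "CatBoost": {"auc": 0.69, "recall": 0.72, "accuracy": 0.66, "f1": 0.68},
    "DNN":      {"auc": 0.62, "recall": 0.79, "accuracy": 0.61, "f1": 0.67},
    "SVM":      {"auc": 0.68, "recall": 0.66, "accuracy": 0.65, "f1": 0.66},
    "KNN":      {"auc": 0.65, "recall": 0.65, "accuracy": 0.63, "f1": 0.64},
}


def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    return f1 * recall / (2 * recall - f1)


# Python's sort is stable, so models with equal AUC keep their listed order.
ranking = sorted(reported, key=lambda m: reported[m]["auc"], reverse=True)

for name in ranking:
    m = reported[name]
    p = implied_precision(m["f1"], m["recall"])
    print(f"{name:8s} AUC={m['auc']:.2f} Recall={m['recall']:.2f} "
          f"implied precision~{p:.2f}")
```

Ranked this way, the DNN has both the lowest AUC and the highest Recall, and its implied precision is the lowest of the six, which is consistent with a model that labels many locations as flood-prone at the cost of more false alarms.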
Contributions
- Simultaneous application and comparative evaluation of six diverse machine learning models (Deep Learning, XGBoost, Random Forest, K-Nearest Neighbors, Support Vector Machine, and CatBoost) for flood susceptibility mapping in the Borujerd-Dorud plain, Iran.
- Methodological advancement in deriving land use data from Sentinel-2 imagery for enhanced Curve Number (CN) estimation.
- Generation of a precipitation layer based on fractal theory for high accuracy in flood modeling.
- Implementation of a rigorous quantitative validation framework using AUC and spatial group k-fold cross-validation to ensure spatial independence and reliable risk assessment.
- Identification of distance from rivers and geology as the most influential factors for flood vulnerability in the specific study area, providing actionable insights for local planning and management.
Funding
No funding was received for conducting this study.
Citation
@article{Haghizadeh2026Preparation,
author = {Haghizadeh, Ali and Sepahvand, Tayebeh and Ghasemi, Leila and Shahinejad, Babak},
title = {Preparation of flood potential maps using machine learning and comparison of their performance},
journal = {Natural Hazards},
year = {2026},
doi = {10.1007/s11069-025-07777-0},
url = {https://doi.org/10.1007/s11069-025-07777-0}
}
Original Source: https://doi.org/10.1007/s11069-025-07777-0