Shirzadi et al. (2025) Leveraging imbalanced dataset in urban flood susceptibility prediction: A case study of Sanandaj City
Identification
- Journal: Journal of Hydrology
- Year: 2025
- Date: 2025-12-06
- Authors: Ataollah Shirzadi, Himan Shahabi, Aryan Salvati, Ehsan Jafari Nodoushan, Sayyed Mohammad Hoseini, Marzieh Hajizadeh Tahan, John J. Clague
- DOI: 10.1016/j.jhydrol.2025.134727
Research Groups
- Department of Rangeland and Watershed Management, Faculty of Natural Resources, University of Kurdistan, Sanandaj, Iran
- Department of Geomorphology, Faculty of Natural Resources, University of Kurdistan, Sanandaj, Iran
- Department of Arid and Mountainous Regions Reclamation, Faculty of Natural Resources, University of Tehran, Karaj, Iran
- Department of Civil Engineering, Campus of Bijar, University of Kurdistan, Sanandaj, Iran
- Gahar Artificial Intelligence Research Group, Ayatollah Boroujerdi University, Boroujerd, Iran
- Department of Computer Engineering, Faculty of Engineering, Meybod University, Meybod, Iran
- Department of Earth Sciences, Simon Fraser University, Burnaby, BC, Canada
Short Summary
This study tests whether imbalanced datasets improve urban flood susceptibility prediction in Sanandaj City, Iran. It compares a hybrid machine learning model (RFADT) with a deep learning model (CNN) and finds that imbalanced datasets yield significantly higher prediction accuracy than balanced ones.
Objective
- To evaluate whether imbalanced datasets improve urban flood susceptibility prediction relative to traditional balanced datasets, using a hybrid machine learning model (Rotation Forest-based Alternating Decision Tree, RFADT) and a deep learning model (Convolutional Neural Network, CNN).
Study Configuration
- Spatial Scale: Sanandaj City, Iran.
- Temporal Scale: Historical flood events (174 locations used for model training and validation).
Methodology and Data
- Models used: Alternating Decision Tree (ADT), Rotation Forest-based Alternating Decision Tree (RFADT), Convolutional Neural Network (CNN).
- Data sources: 174 historical flood locations (observation data), 19 conditioning factors (geospatial data).
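As an illustration of the 1n to 10n dataset sizes reported under Main Results, the following minimal Python sketch assumes that a "kn" dataset pairs the n flood locations with k × n randomly sampled non-flood locations. That reading, the function name `build_dataset`, the size of the candidate pool, and the placeholder points are all assumptions made for illustration; this summary does not state how the authors sampled non-flood points.

```python
import random


def build_dataset(flood_points, candidate_non_flood, ratio, seed=0):
    """Combine all flood points with ratio * n sampled non-flood points.

    Assumption: "kn" means k times as many non-flood samples as flood samples.
    """
    n = len(flood_points)
    needed = ratio * n
    if needed > len(candidate_non_flood):
        raise ValueError("not enough non-flood candidates for this ratio")
    rng = random.Random(seed)
    # Sample without replacement so no non-flood point is duplicated.
    non_flood = rng.sample(candidate_non_flood, needed)
    # Label 1 = flood, 0 = non-flood.
    samples = [(p, 1) for p in flood_points] + [(p, 0) for p in non_flood]
    rng.shuffle(samples)
    return samples


# Hypothetical placeholders standing in for real point locations.
flood_points = [("flood", i) for i in range(174)]
candidates = [("non_flood", i) for i in range(5000)]

datasets = {k: build_dataset(flood_points, candidates, k) for k in (1, 6, 8, 10)}
for k, samples in datasets.items():
    positives = sum(label for _, label in samples)
    print(f"{k}n: {len(samples)} samples, {positives} flood, {len(samples) - positives} non-flood")
```

Under this assumption the 1n case is the balanced dataset (174 flood and 174 non-flood points), and the 8n case holds 174 flood and 1,392 non-flood points.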
Main Results
- RFADT and CNN models generally outperformed the ADT model across various dataset configurations.
- All models (ADT, RFADT, and CNN) demonstrated a significant and consistent increase in Area Under the Curve (AUC) as the size of the imbalanced dataset increased from 1n to 10n (where 'n' is the number of flood events).
- Imbalanced datasets, particularly at the 6n, 8n, and 10n ratios, proved more effective for reliable urban flood susceptibility mapping than balanced datasets (e.g., the AUC for ADT was 0.774 with the balanced dataset, versus 0.850 with the 10n dataset).
- Best model performances with imbalanced datasets:
  - ADT model achieved its best performance with a 10n imbalanced dataset (MAE = 0.215, RMSE = 0.282, AUC = 0.850).
  - RFADT model performed best with an 8n imbalanced dataset (MAE = 0.216, RMSE = 0.282, AUC = 0.912) and a 6n dataset (MAE = 0.261, RMSE = 0.317, AUC = 0.884).
  - CNN model achieved its best performance with an 8n imbalanced dataset (MAE = 0.102, RMSE = 0.319, AUC = 0.925) and a 10n dataset (MAE = 0.086, RMSE = 0.293, AUC = 0.898).
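The three metrics quoted above (MAE, RMSE, AUC) can be computed as in the sketch below. It is a generic illustration using the standard definitions, not the authors' evaluation code. It assumes that MAE and RMSE are measured between the 0/1 flood labels and the predicted flood probabilities, and it computes AUC as the fraction of flood/non-flood pairs that the model ranks correctly. The labels and scores are invented toy values.

```python
import math


def mae(y_true, y_score):
    """Mean absolute error between 0/1 labels and predicted probabilities."""
    return sum(abs(t - s) for t, s in zip(y_true, y_score)) / len(y_true)


def rmse(y_true, y_score):
    """Root mean squared error between 0/1 labels and predicted probabilities."""
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(y_true, y_score)) / len(y_true))


def auc(y_true, y_score):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one flood and one non-flood sample")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example: 2 flood samples and 4 non-flood samples.
y_true = [1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.4, 0.5, 0.2, 0.1, 0.3]

print(f"MAE={mae(y_true, y_score):.3f} "
      f"RMSE={rmse(y_true, y_score):.3f} "
      f"AUC={auc(y_true, y_score):.3f}")
# MAE=0.300 RMSE=0.356 AUC=0.875
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a dataset of a few thousand points; a rank-based formulation would scale better for larger ones.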
Contributions
- Demonstrates that leveraging imbalanced datasets, specifically with ratios of 6n, 8n, and 10n, significantly improves the accuracy and reliability of urban flood susceptibility prediction compared to using balanced datasets.
- Provides a valuable methodological approach for urban flood mapping, particularly in scenarios where balanced datasets may lead to suboptimal model performance.
Funding
- No funding information was provided in the article text.
Citation
@article{Shirzadi2025Leveraging,
author = {Shirzadi, Ataollah and Shahabi, Himan and Salvati, Aryan and Nodoushan, Ehsan Jafari and Hoseini, Sayyed Mohammad and Tahan, Marzieh Hajizadeh and Clague, John J.},
title = {Leveraging imbalanced dataset in urban flood susceptibility prediction: A case study of Sanandaj City},
journal = {Journal of Hydrology},
year = {2025},
doi = {10.1016/j.jhydrol.2025.134727},
url = {https://doi.org/10.1016/j.jhydrol.2025.134727}
}
Original Source: https://doi.org/10.1016/j.jhydrol.2025.134727