Li et al. (2025) A Comparative Study of Urban Pluvial Flood Susceptibility Assessment Based on Multi-Machine Learning Algorithm
Identification
- Journal: Water Resources Management
- Year: 2025
- Date: 2025-12-29
- Authors: Yong Li, Z. Fang, Jun Liu, Zhengsheng Lu, Hong Zhou, Wenhao Yin, Xiaolan Chen
- DOI: 10.1007/s11269-025-04414-0
Research Groups
- College of Hydrology and Water Resources, Hohai University, Nanjing, China
- Shanghai Xunxiang Water Conservancy Engineering Co., LTD, Shanghai, China
Short Summary
This study developed and benchmarked a multi-machine learning framework for urban pluvial flood susceptibility assessment in Wuxi, China, finding that the Particle Swarm Optimization-optimized eXtreme Gradient Boosting (PSO-XGB) model achieved superior predictive performance and spatial delineation compared to other models.
Objective
- To identify nine key factors influencing pluvial flood susceptibility and develop six machine learning models using a dataset of 200 balanced flood and non-flood samples.
- To evaluate and compare the performance of these models using multiple metrics (accuracy, precision, recall, F1-score, AUC, RMSE) under different hyperparameter optimization strategies (Bayesian Optimization and Particle Swarm Optimization).
- To apply SHapley Additive exPlanations (SHAP) for assessing the importance of explanatory variables and exploring their interactions, thereby uncovering underlying mechanisms of pluvial flood susceptibility.
- To construct spatial flood risk maps, analyze the spatial distribution of high-risk areas identified by each model, and evaluate their ability to detect known inundation points.
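As a minimal sketch of the Particle Swarm Optimization strategy named in the objectives, the block below runs a basic global-best PSO over a two-parameter search box. It is an illustration under stated assumptions, not the authors' configuration: the objective `toy_validation_loss`, the parameter names `learning_rate` and `max_depth`, the bounds, and the swarm settings are all hypothetical, and in practice the objective would be a cross-validated loss of the model being tuned.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` inside the box `bounds` with a global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random starting positions inside the box, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_loss(params):
    """Hypothetical stand-in for a validation loss; minimum at (0.1, 6.0)."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 6.0) ** 2


# Hypothetical search box: learning_rate in [0.01, 0.5], max_depth in [2, 12].
# max_depth is treated as continuous here; a real run would round it.
best_params, best_loss = pso_minimize(toy_validation_loss,
                                      [(0.01, 0.5), (2.0, 12.0)])
print(best_params, best_loss)
```

Bayesian Optimization would plug into the same objective but choose each new candidate from a surrogate model of the loss instead of from swarm dynamics.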
Study Configuration
- Spatial Scale: Central urban area of Wuxi, Jiangsu Province, China, covering approximately 148.7 square kilometers. All spatial data were processed to a consistent 30 meter x 30 meter raster resolution.
- Temporal Scale: Pluvial flood season typically occurs from May to September; non-flood season from October to April. Historical flood event data were collected, but the study focuses on static susceptibility assessment rather than real-time forecasting.
Methodology and Data
- Models used:
  - Machine Learning Algorithms: Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting (GB), eXtreme Gradient Boosting (XGB).
  - Hyperparameter Optimization: Bayesian Optimization (BO) and Particle Swarm Optimization (PSO), yielding the optimized BO-XGB and PSO-XGB models.
  - Multicollinearity Analysis: Variance Inflation Factor (VIF).
  - Model Interpretation: SHapley Additive exPlanations (SHAP).
- Data sources:
  - Pluvial flood inventory: 200 balanced samples (100 flood, 100 non-flood points) collected from local governments, municipal departments, street-level administrative offices, news reports, and social media posts.
  - Explanatory variables (10 candidate factors, 9 retained after VIF screening; 30 meter resolution):
    - Annual maximum daily precipitation (AP): High-resolution daily gridded dataset for mainland China (National Tibetan Plateau Science Data Center).
    - Elevation (EV), Slope (SL), Aspect (AS): 30-meter resolution Digital Elevation Model (DEM) from ESA Copernicus Panda.
    - Normalized Difference Vegetation Index (NDVI): Derived from satellite imagery.
    - Green space area (GA): SinoLC-1 dataset (1 meter resolution land-cover classification).
    - Distance to rivers (DRI): GIS-based Euclidean distance analysis.
    - Population density (PD): LandScan dataset (1 kilometer resolution global population distribution data).
    - Gross domestic product (GDP): Source not explicitly stated; used as a spatial proxy.
    - Drainage pipeline network density (DPN): Calculated from pipe length and area.
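The VIF screening step can be sketched as follows. This is a minimal standard-library illustration, not the authors' implementation: it uses the identity that the VIF of factor j equals the j-th diagonal entry of the inverse correlation matrix, the sample values are hypothetical, and the cutoff of 3 is an assumption based only on the reported fact that all retained factors have VIF below 3.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centred = [[x - m for x in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(x * x for x in c)) for c in centred]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centred[i], centred[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("factors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF_j = 1 / (1 - R_j^2), read off the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Hypothetical values at six sample points (not data from the study).
names = ["elevation", "slope", "precipitation"]
columns = [
    [3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [1.1, 1.4, 1.6, 2.1, 2.3, 2.6],              # nearly linear in elevation
    [120.0, 95.0, 130.0, 100.0, 125.0, 105.0],
]
vif_values = vif(columns)
# Assumed cutoff of 3; flagged factors are candidates for removal.
flagged = [name for name, v in zip(names, vif_values) if v >= 3.0]
print(dict(zip(names, vif_values)), flagged)
```

In a screening workflow, one flagged factor would typically be dropped and the VIFs recomputed until every remaining value falls below the cutoff.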
Main Results
- Multicollinearity analysis reduced the initial 10 explanatory factors to 9 (Elevation, Annual Maximum Daily Precipitation, Population Density, NDVI, GDP, Distance to Rivers, Green Space Area, Aspect, and Drainage Network Density), with all final VIF values below 3. Slope is the only candidate factor absent from the retained set.
- The PSO-XGB model achieved the highest predictive performance on the test set, with an accuracy of 0.933, precision of 0.964, recall of 0.900, F1-score of 0.931, AUC of 0.956, and the lowest Root Mean Square Error (RMSE) of 0.2582. It outperformed BO-XGB (test RMSE = 0.3651) and all single machine learning models.
- Spatial analysis revealed that PSO-XGB correctly identified 92% of known pluvial flood points, with the highest density (5.13 points per square kilometer) concentrated in high-susceptibility zones. Approximately 53% of the study area was classified as low or lowest susceptibility by PSO-XGB. High and highest susceptibility zones were predominantly located in the central urban area and near Lihu Lake, attributed to rapid urbanization, high building density, extensive road networks, and lower elevation.
- SHAP analysis identified elevation (0.55), annual maximum daily precipitation (0.51), and population density (0.48) as the most significant predictors of pluvial flood susceptibility. Elevation showed a negative correlation with flood risk, while precipitation and population density had positive correlations. Drainage network density was a secondary but relevant factor.
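The reported PSO-XGB test metrics can be cross-checked with a short calculation. The sketch below derives accuracy, precision, recall, F1-score, and RMSE from a confusion matrix. The counts are hypothetical: neither the confusion matrix nor the test-set size is reported here, and the example assumes a 60-sample test split and an RMSE computed on hard 0/1 class labels. Under those assumptions, the chosen counts reproduce the reported values. AUC is omitted because it depends on the ranked scores rather than on the confusion matrix.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # With hard 0/1 predictions every error contributes a squared
        # residual of 1, so RMSE reduces to sqrt(error rate).
        "rmse": math.sqrt((fp + fn) / total),
    }


# Hypothetical counts for an assumed 60-sample test split (30 flood, 30 non-flood).
metrics = classification_metrics(tp=27, fp=1, fn=3, tn=29)
rounded = {name: round(value, 4) for name, value in metrics.items()}
print(rounded)

# Same assumption applied to BO-XGB: 8 errors out of 60 test samples.
bo_xgb_rmse = math.sqrt(8 / 60)
print(round(bo_xgb_rmse, 4))
```

Under this reading, RMSE equals the square root of one minus accuracy, so it ranks the models exactly as accuracy does rather than adding independent evidence.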
Contributions
- Developed a novel multi-model benchmarking framework for urban pluvial flood susceptibility assessment, systematically comparing individual and optimized ensemble machine learning models.
- Demonstrated the superior predictive accuracy and spatial robustness of the PSO-XGB model for urban pluvial flood susceptibility mapping, providing a highly effective tool for risk assessment.
- Enhanced model interpretability through the application of SHAP, offering transparent insights into the key drivers of pluvial flood risk, including the significant role of anthropogenic factors like population density.
- Generated high-resolution spatial flood susceptibility maps that accurately delineate flood-prone zones, offering valuable scientific support for urban planning, infrastructure design, and emergency management.
- Addressed a research gap by conducting a comprehensive comparative evaluation of multiple ML models with hyperparameter optimization using standardized datasets and consistent performance metrics.
Funding
- National Natural Science Foundation of China (No. 42301015)
- Natural Science Foundation of Jiangsu Province, China (No. BK20230961)
Citation
@article{Li2025Comparative,
author = {Li, Yong and Fang, Z. and Liu, Jun and Lu, Zhengsheng and Zhou, Hong and Yin, Wenhao and Chen, Xiaolan},
title = {A Comparative Study of Urban Pluvial Flood Susceptibility Assessment Based on Multi-Machine Learning Algorithm},
journal = {Water Resources Management},
year = {2025},
doi = {10.1007/s11269-025-04414-0},
url = {https://doi.org/10.1007/s11269-025-04414-0}
}
Original Source: https://doi.org/10.1007/s11269-025-04414-0