Yu et al. (2026) Region-specific assessment of flood disaster risk and contributing factors, based on historical data and machine learning
Identification
- Journal: Natural Hazards
- Year: 2026
- Date: 2026-01-01
- Authors: Yang Yu, Wen Zhu, Qiuan Zhu, Jiaxin Jin, Shanhu Jiang, Shanshui Yuan, Xiaoli Yang, Xiaoxiang Zhang, Liliang Ren, Xiuqin Fang
- DOI: 10.1007/s11069-025-07827-7
Research Groups
- College of Geography and Remote Sensing, Hohai University, Nanjing, China
- State Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing, China
- College of Hydrology and Water Resources, Hohai University, Nanjing, China
- Yangtze Institute for Conservation and Development, Hohai University, Nanjing, China
- Key Laboratory of Hydrologic-Cycle and Hydrodynamic-System of Ministry of Water Resources, Hohai University, Nanjing, China
Short Summary
This study assesses the risk of major flood disasters worldwide, and the factors contributing to it, using historical data and machine learning. It finds strong regional heterogeneity in vulnerability and in the primary drivers of flood risk across climate zones and levels of socio-economic development.
Objective
- To apply machine learning at the global level to assess the risk of major flooding and comprehensively identify its major contributors, considering variations across climate zones and levels of economic development.
Study Configuration
- Spatial Scale: Global, with analysis conducted at the politically defined provincial scale for the Flood Disaster Index (FDI) and at a 0.25° grid resolution for influencing factors.
- Temporal Scale: Flood event data from 1980 to 2023 (EM-DAT); hydro-meteorological data from 2010 to 2019; vegetation data from 1981 to 2015; other influencing factors are multi-year averages.
Methodology and Data
- Models used:
  - Flood Disaster Index (FDI) calculation: Analytic Hierarchy Process (AHP) and Entropy Weighting (EW) for hazard metric weighting, combined linearly.
  - Factor contribution analysis: Extreme Gradient Boosting (XGBoost) and Random Forest (RF) machine learning algorithms.
  - Hyperparameter tuning: GridSearchCV with fivefold cross-validation.
  - Consistency evaluation: Spearman rank correlation coefficient (ρ).
- Data sources:
  - Flood events: Emergency Events Database (EM-DAT).
  - Hydro-meteorological variables: Global Land Data Assimilation System (GLDAS_NOAH025_3H 2.1) for precipitation (mm), runoff (mm), and surface soil moisture (m³/m³).
  - Vegetation: Global Inventory Modelling and Mapping Studies (GIMMS) NDVI3g dataset for the Normalized Difference Vegetation Index (NDVI).
  - River network density: Data from Schneider et al. (2017).
  - Topography: Shuttle Radar Topography Mission (SRTM) Global Digital Elevation Model (DEM), used to derive slope (degrees) and Terrain Roughness Index (TRI).
  - Socio-economic level: LandScan global population dynamics dataset for population density (people/km²), Kummu et al. (2018) for Gross Domestic Product (GDP), and Global Roads Inventory Project (GRIP) for road density.
  - Presence of dams: Global Georeferenced Database of Dams (Mulligan et al. 2020) for number of dams per grid cell.
  - Zoning: Revised Köppen-Geiger climate classification for climate zones; Chen et al. (2022) for Global North/South socio-economic zones.
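The hazard-metric weighting listed under "Models used" (AHP and EW, combined linearly) can be sketched as follows. This is a minimal illustration, not the authors' code. It makes several assumptions: the entropy weights use the standard textbook formula, the combination is taken to be the weighted sum α·w_AHP + β·w_EW with the reported α = 0.6092 and β = 0.3908, and the min-max scaling in `fdi_scores` is a simplification. The AHP weights, the three metrics, and the four regions are hypothetical placeholders.

```python
import math

ALPHA, BETA = 0.6092, 0.3908  # AHP and EW shares reported in the summary


def entropy_weights(rows):
    """Standard entropy weights for the columns of a non-negative matrix.

    Assumes every column has a positive sum and at least one column varies.
    """
    n, m = len(rows), len(rows[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        shares = [value / total for value in column]
        # Zero shares contribute nothing to the entropy (limit of p * ln p).
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    scale = sum(divergences)
    return [d / scale for d in divergences]


def combine_weights(ahp, ew, alpha=ALPHA, beta=BETA):
    """Linear combination of subjective (AHP) and objective (EW) weights."""
    return [alpha * a + beta * e for a, e in zip(ahp, ew)]


def fdi_scores(rows, weights):
    """Weighted sum of min-max scaled metrics (scaling is an assumption)."""
    m = len(weights)
    lows = [min(row[j] for row in rows) for j in range(m)]
    highs = [max(row[j] for row in rows) for j in range(m)]
    scores = []
    for row in rows:
        scaled = [
            (row[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(m)
        ]
        scores.append(sum(w * s for w, s in zip(weights, scaled)))
    return scores


# Hypothetical data: 4 regions x 3 metrics (deaths, people affected, losses).
regions = [
    [120.0, 5.0e5, 2.0e3],
    [15.0, 8.0e4, 9.0e3],
    [300.0, 1.2e6, 4.0e2],
    [40.0, 2.0e5, 1.5e3],
]
ahp_weights = [0.5, 0.3, 0.2]  # hypothetical expert-derived weights
ew_weights = entropy_weights(regions)
weights = combine_weights(ahp_weights, ew_weights)
scores = fdi_scores(regions, weights)
```

Because α + β = 1 and both input weight vectors sum to 1, the combined weights also sum to 1, so the resulting scores stay on a common 0 to 1 scale.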
Main Results
- High-risk regions: China, South Asia, western Arabian Peninsula, western Germany, Java (Indonesia), Zulia (Venezuela), and eastern Australia were identified as particularly vulnerable to major flooding. China has seven provinces among the top 10 globally for FDI.
- FDI weighting: The linearly combined weighting factors for the eight hazard metrics (e.g., total deaths, total affected, economic losses, and their per-event ratios) were derived, with AHP results weighted more (α = 0.6092) than EW results (β = 0.3908).
- Global key influencing factors: Both XGBoost and RF identified river network density, GDP, DEM, precipitation (e.g., RAIN6), NDVI, number of dams, heavy rainfall days, surface soil moisture, and mean annual precipitation as key global factors, with high consistency (Spearman ρ = 0.7098).
- Climate-specific factors:
  - Tropical regions: GDP (most important), mean annual precipitation, 12-hour and 24-hour maximum precipitation, surface soil moisture, and NDVI.
  - Arid regions: NDVI (most important), GDP, 6-hour and 24-hour maximum precipitation, heavy rainfall days, surface soil moisture, and DEM.
  - Temperate regions: Population, heavy rainfall days, road density, GDP, and TRI.
  - Cold regions: 6-hour maximum precipitation, surface soil moisture, DEM, mean annual precipitation, and NDVI.
  - Polar regions: DEM (most important), river network density, heavy rainfall days, per capita GDP, and NDVI.
- Socio-economic specific factors:
  - Global North: 6-hour maximum precipitation was the most important factor.
  - Global South: GDP was the most important factor.
  - Both regions also showed importance for NDVI, DEM, heavy rainfall days, and mean annual precipitation.
- Model consistency: Spearman rank correlation coefficients between the XGBoost and RF importance rankings were positive and statistically significant (p < 0.01) in all zones, indicating that the two models agree on the main drivers.
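The consistency check between the XGBoost and RF importance rankings can be illustrated with a small, dependency-free Spearman computation. The sketch below is an illustration under stated assumptions, not the study's code: the importance values are invented placeholders rather than results from the paper, and ρ is computed as the Pearson correlation of average ranks, which is the standard definition and handles ties.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical importance scores for the same factors from two models.
factors = ["river_density", "gdp", "dem", "ndvi", "dams", "soil_moisture"]
xgb_importance = [0.24, 0.21, 0.17, 0.15, 0.13, 0.10]  # placeholder values
rf_importance = [0.22, 0.25, 0.14, 0.16, 0.09, 0.14]   # placeholder values

rho = spearman_rho(xgb_importance, rf_importance)
```

A value of ρ close to 1 means both models order the factors similarly. In practice, a library routine such as `scipy.stats.spearmanr` would also return the p-value needed for the significance test.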
Contributions
- Constructed a comprehensive Flood Disaster Index (FDI) at the provincial scale using four decades of global historical flood data from EM-DAT, achieving a finer spatial resolution than previous studies.
- Established a balanced framework for flood hazard metric weighting by integrating both subjective (AHP) and objective (EW) methods, enhancing the robustness of risk quantification.
- Employed two machine learning algorithms (XGBoost and RF) in parallel to identify the relative contributions of multiple natural and socio-economic factors, improving the interpretability and reliability of the findings.
- Provided novel, region-specific insights by analyzing how key flood risk factors differ across climatic zones and levels of socio-economic development, informing more targeted disaster prevention and mitigation strategies globally.
Funding
- National Key R&D Program of China (Grant No. 2023YFC3006701)
Citation
@article{Yu2026Regionspecific,
author = {Yu, Yang and Zhu, Wen and Zhu, Qiuan and Jin, Jiaxin and Jiang, Shanhu and Yuan, Shanshui and Yang, Xiaoli and Zhang, Xiaoxiang and Ren, Liliang and Fang, Xiuqin},
title = {Region-specific assessment of flood disaster risk and contributing factors, based on historical data and machine learning},
journal = {Natural Hazards},
year = {2026},
doi = {10.1007/s11069-025-07827-7},
url = {https://doi.org/10.1007/s11069-025-07827-7}
}
Original Source: https://doi.org/10.1007/s11069-025-07827-7