Yang et al. (2026) Machine learning-enhanced static flood models for high-resolution peak storm surge inundation mapping in Southeast Texas, USA

Identification

Journal: Journal of Hydrology: Regional Studies
Year: 2026
Date: 2026-01-06
Authors: Hyunje Yang, Jun-Whan Lee, Armando Ulises Santos Cruz
DOI: 10.1016/j.ejrh.2025.103056

Research Groups

Maseeh Department of Civil, Architectural and Environmental Engineering, The University of Texas at Austin, Austin, TX, United States

Short Summary

This study proposes C1PK-Flood, a hybrid framework that couples a machine learning model (C1PKNet) with a static flood model (MatFlood) to produce rapid, high-resolution maps of peak storm surge inundation. By driving the static model with ML-predicted water levels instead of limited in-situ measurements, it improves accuracy and applicability in data-scarce regions and cuts run time to minutes on a single CPU.

Objective

To develop a hybrid framework (C1PK-Flood) that uses machine learning to enhance the accuracy and applicability of static flood models for rapid, high-resolution peak storm surge inundation mapping, particularly in data-scarce coastal regions.

Study Configuration

Spatial Scale: Coastal floodplain surrounding Galveston, southeastern Texas, USA (approximately 95.14°W to 94.34°W and 29.05°N to 29.64°N). High-resolution inundation maps generated over an 8000 × 5900 spatial grid, covering approximately 19 million inland grid cells, with a resolution of approximately 11 meters.
Temporal Scale: Analysis of 446 synthetic Tropical Cyclone (TC) scenarios. The machine learning model uses an 11-hour time series window of TC parameters recorded at 1-hour intervals, spanning 4 hours before to 6 hours after landfall (11 hourly samples, counting the landfall hour).
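
The stated resolution and cell counts can be cross-checked from the domain bounds and grid shape. The sketch below is only a consistency check, not part of the study: it assumes 8000 columns in longitude and 5900 rows in latitude, and uses a common spherical approximation of 111.32 km per degree of latitude. Under those assumptions the north-south cell size comes out at roughly 11 m, and the 19 million inland cells make up about two-fifths of the full grid.

```python
import math

# Domain bounds and grid shape as stated in the summary.
LON_WEST, LON_EAST = -95.14, -94.34
LAT_SOUTH, LAT_NORTH = 29.05, 29.64
N_COLS, N_ROWS = 8000, 5900       # assumption: columns span longitude, rows span latitude
INLAND_CELLS = 19_000_000         # approximate inland cell count from the summary

METERS_PER_DEG_LAT = 111_320.0    # assumption: spherical-Earth approximation

# Angular size of one grid cell in each direction.
dlon = (LON_EAST - LON_WEST) / N_COLS
dlat = (LAT_NORTH - LAT_SOUTH) / N_ROWS

# Convert to meters; east-west spacing shrinks with the cosine of latitude.
mid_lat = math.radians((LAT_SOUTH + LAT_NORTH) / 2)
cell_ns_m = dlat * METERS_PER_DEG_LAT
cell_ew_m = dlon * METERS_PER_DEG_LAT * math.cos(mid_lat)

total_cells = N_COLS * N_ROWS
inland_fraction = INLAND_CELLS / total_cells

print(f"cell size: {cell_ns_m:.1f} m (N-S) x {cell_ew_m:.1f} m (E-W)")
print(f"total cells: {total_cells:,}; inland share: {inland_fraction:.0%}")
```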

Methodology and Data

Models used:
- Hybrid framework: C1PK-Flood
- Machine Learning component: C1PKNet (a one-dimensional Convolutional Neural Network (CNN) incorporating Principal Component Analysis (PCA) and K-means clustering (KMC)).
- Static flood model component: MatFlood (an efficient algorithm for mapping flood extent and depth, translated from MATLAB to Python).
- Reference numerical models for synthetic data generation: ADCIRC (storm surges), STWAVE (nearshore waves), WAM (offshore waves).
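
For readers unfamiliar with static flood models, the general idea can be sketched as a connectivity-constrained "bathtub" fill: a cell floods only if it lies below the water level and is connected to the water source through other flooded cells. The code below is a minimal illustration of that idea, not MatFlood's actual algorithm. The function name `static_flood`, the 4-neighbor connectivity, the single uniform water level, and the toy DEM are all assumptions made for the example.

```python
from collections import deque

def static_flood(dem, water_level, seeds):
    """Return a grid of flood depths in meters (0.0 where dry).

    dem         : list of rows of ground elevations (m)
    water_level : single still-water level (m), same datum as dem
    seeds       : (row, col) cells assumed to be open water (the surge source)
    """
    n_rows, n_cols = len(dem), len(dem[0])
    depth = [[0.0] * n_cols for _ in range(n_rows)]
    visited = [[False] * n_cols for _ in range(n_rows)]
    queue = deque()

    # Start the fill from seed cells that are actually below the water level.
    for r, c in seeds:
        if dem[r][c] < water_level and not visited[r][c]:
            visited[r][c] = True
            queue.append((r, c))

    # Breadth-first spread to 4-connected neighbors below the water level.
    while queue:
        r, c = queue.popleft()
        depth[r][c] = water_level - dem[r][c]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n_rows and 0 <= nc < n_cols and not visited[nr][nc]:
                if dem[nr][nc] < water_level:
                    visited[nr][nc] = True
                    queue.append((nr, nc))
    return depth

# Toy DEM (illustrative values): column 0 is the sea, column 2 is a 3 m ridge,
# and column 3 is a low pocket that the ridge cuts off from the sea.
dem = [
    [-1.0, 0.5, 3.0, 0.2],
    [-1.0, 0.8, 3.0, 0.1],
    [-1.0, 1.6, 3.0, 0.3],
]
depth = static_flood(dem, water_level=1.5, seeds=[(0, 0), (1, 0), (2, 0)])
for row in depth:
    print([round(d, 2) for d in row])
```

In this toy case the low pocket behind the ridge stays dry even though it lies below the water level, which is the behavior that separates a connectivity-aware static model from a plain elevation threshold.
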
Data sources:
- Synthetic storm surge simulation dataset (Dawson et al., 2021) comprising 446 synthetic TC scenarios for the Texas coastline, developed by the U.S. Army Corps of Engineers (USACE) and Federal Emergency Management Agency (FEMA).
- Digital Elevation Models (DEMs) derived from an unstructured triangular grid (Dawson et al., 2021), converted to a structured grid for MatFlood.
- Time series of six TC parameters: central pressure (Cp), translation speed (Vf), heading direction (θ), radius of maximum winds (Rmax), latitude (LAT), and longitude (LON).
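
The model input implied by the six parameters and the 11-hour window is a small 6 × 11 array per storm. The sketch below shows one way such a window could be cut from an hourly track. It is an illustration only: the function `landfall_window`, the dictionary-based track format, the channels-first layout, and the toy track values are assumptions, not details taken from the paper.

```python
PARAMS = ("Cp", "Vf", "theta", "Rmax", "LAT", "LON")
HOURS_BEFORE, HOURS_AFTER = 4, 6   # window bounds relative to landfall

def landfall_window(track, landfall_index):
    """Slice an hourly track into the 11-step window around landfall.

    track          : list of dicts, one per hour, keyed by PARAMS
    landfall_index : position of the landfall hour within track
    Returns 6 channels of 11 hourly values each (channels-first layout,
    the shape a 1D CNN would typically consume).
    """
    start = landfall_index - HOURS_BEFORE
    stop = landfall_index + HOURS_AFTER + 1   # +1 so the last hour is included
    if start < 0 or stop > len(track):
        raise ValueError("track does not cover the full window")
    window = track[start:stop]
    return [[step[name] for step in window] for name in PARAMS]

# Toy 24-hour track with made-up values, for shape checking only.
track = [
    {"Cp": 960.0 + h, "Vf": 5.0, "theta": 330.0, "Rmax": 40.0,
     "LAT": 27.0 + 0.2 * h, "LON": -94.0 - 0.1 * h}
    for h in range(24)
]
x = landfall_window(track, landfall_index=12)
print(len(x), "channels x", len(x[0]), "time steps")
```

The landfall hour sits at offset 4 within each channel, so the window is deliberately asymmetric, with more samples after landfall than before.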

Main Results

The C1PK-Flood model generated high-resolution peak storm surge inundation maps (approximately 19 million grid cells) within 2–3 minutes using a single multi-core CPU.
It achieved a weighted mean Root Mean Square Error (RMSE) of 0.44 meters for peak storm surge and a mean F1 score of 0.94 for dry/wet classification across 10 test TCs.
Incorporating multiple synthetic observation points (1217 inland points) significantly improved the spatial accuracy of inundation maps compared to single-point static flood models, particularly in areas distant from the primary observation point.
Optimization of the input configuration (an 11-hour time series window) reduced the normalized RMSE by up to 10% compared to using longer, unoptimized time series inputs.
Time series inputs of TC parameters generally outperformed landfall-only inputs, especially for predicting extreme surge events (peak water levels exceeding 1.5 meters).
The model achieved comparable accuracy with reduced training sets (e.g., 210 or 300 of the original 436 training TCs, roughly 48% and 69%), indicating the potential for efficient training data design.
Approximately 64% of the ML prediction error propagated to the final inundation RMSE, with an error-damping threshold of approximately 0.11 meters, suggesting robustness to moderate ML prediction errors.
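
The two headline metrics above can be reproduced on any pair of predicted and reference depth maps. The following is a minimal sketch under stated assumptions: "wet" is taken to mean depth greater than zero, the weighted mean uses per-TC cell counts as weights, and a map pair with no wet cells scores 0.0. The summary does not specify any of these choices, and the sample values are illustrative only.

```python
import math

def rmse(pred, ref):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def wet_dry_f1(pred, ref, wet_threshold=0.0):
    """F1 score with 'wet' (depth > wet_threshold) as the positive class."""
    tp = fp = fn = 0
    for p, r in zip(pred, ref):
        p_wet, r_wet = p > wet_threshold, r > wet_threshold
        if p_wet and r_wet:
            tp += 1
        elif p_wet:
            fp += 1
        elif r_wet:
            fn += 1
    denom = 2 * tp + fp + fn
    # Assumption: return 0.0 when neither map has any wet cell.
    return 2 * tp / denom if denom else 0.0

def weighted_mean(values, weights):
    """Weighted mean, e.g. per-TC RMSE weighted by cell count (assumed scheme)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Illustrative depths (m) at five cells for one test storm.
pred = [0.0, 0.5, 1.2, 0.0, 2.0]
ref = [0.0, 0.7, 1.0, 0.3, 2.0]

storm_rmse = rmse(pred, ref)
storm_f1 = wet_dry_f1(pred, ref)
overall = weighted_mean([0.4, 0.5], [3, 1])

print(f"RMSE = {storm_rmse:.3f} m, F1 = {storm_f1:.3f}, weighted mean = {overall:.3f}")
```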

Contributions

Developed a novel hybrid framework (C1PK-Flood) that integrates machine learning (C1PKNet) with a static flood model (MatFlood) to overcome the trade-offs between accuracy, computational efficiency, and spatial resolution in storm surge prediction.
Enhanced the applicability of static flood models to data-scarce regions by utilizing ML-predicted water levels at multiple synthetic observation points based on nationally available TC parameters, rather than relying on limited in-situ measurements.
Demonstrated significant computational efficiency, enabling the generation of high-resolution inundation maps in minutes on a single CPU, a substantial improvement over traditional physics-based models that require hours on high-performance computing clusters.
Systematically optimized the temporal window of TC parameter inputs for the ML model, leading to improved predictive performance.
Provided evidence that incorporating time series of TC parameters is more effective for storm surge prediction than relying on single-time-step (landfall-only) inputs, particularly for extreme events.
Showcased the potential for efficient training data design, achieving comparable model performance with a reduced number of synthetic TC scenarios.

Funding

Good Systems Grand Challenge at the University of Texas at Austin
University of Texas at Austin Startup Grant

Citation

@article{Yang2026Machine,
  author = {Yang, Hyunje and Lee, Jun-Whan and Cruz, Armando Ulises Santos},
  title = {Machine learning-enhanced static flood models for high-resolution peak storm surge inundation mapping in Southeast Texas, USA},
  journal = {Journal of Hydrology: Regional Studies},
  year = {2026},
  doi = {10.1016/j.ejrh.2025.103056},
  url = {https://doi.org/10.1016/j.ejrh.2025.103056}
}

Original Source: https://doi.org/10.1016/j.ejrh.2025.103056