Al-Rawas et al. (2026) Leveraging Machine Learning Flood Forecasting: A Multi-Dimensional Approach to Hydrological Predictive Modeling
Identification
- Journal: Water
- Year: 2026
- Date: 2026-01-12
- Authors: Ghazi Al-Rawas, Mohammad Reza Nikoo, Nasim Sadra, Malik Al-Wardy
- DOI: 10.3390/w18020192
Research Groups
- Department of Civil and Architectural Engineering, Sultan Qaboos University, Muscat, Oman
- School of Mathematical and Computational Sciences, Massey University, Palmerston North, New Zealand
- Center for Environmental Studies and Research, Sultan Qaboos University, Muscat, Oman
Short Summary
This study introduces an LSTM model with a customized loss function and wavelet decomposition to predict extreme rainfall events in Oman's Al-Batinah region. The model outperforms traditional machine learning baselines, and a Bayesian MCMC framework quantifies its predictive uncertainty.
Objective
- To develop and evaluate a custom extreme loss function for LSTM networks that improves the prediction of extreme rainfall events in Oman's Al-Batinah region without sacrificing overall model performance, and to quantify predictive uncertainty using a Bayesian MCMC framework.
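The summary does not give the exact form of the custom extreme loss, so the following is only a minimal sketch of one plausible reading: a quantile (pinball) loss whose per-sample contribution is up-weighted when the observed rainfall exceeds a threshold. The function name and the `threshold` and `extreme_weight` parameters are hypothetical and do not come from the paper.

```python
def weighted_quantile_loss(y_true, y_pred, q=0.9, threshold=5.0, extreme_weight=5.0):
    """Mean pinball loss at quantile q, with extra weight on extreme observations.

    Assumption: samples with y_true >= threshold count extreme_weight times;
    all other samples count once. This is an illustration, not the paper's loss.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for yt, yp in zip(y_true, y_pred):
        err = yt - yp
        # Pinball loss: under-prediction costs q * err, over-prediction (1 - q) * |err|.
        pinball = max(q * err, (q - 1.0) * err)
        weight = extreme_weight if yt >= threshold else 1.0
        total += weight * pinball
    return total / len(y_true)


# With q = 0.9, missing an extreme event from below is penalized far more
# heavily than overshooting it by the same amount.
under = weighted_quantile_loss([10.0], [6.0])
over = weighted_quantile_loss([10.0], [14.0])
```

In a deep learning framework the same expression would be written with tensor operations so that it stays differentiable; the pure-Python version here only shows the arithmetic.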
Study Configuration
- Spatial Scale: Al-Batinah area, northeastern coast of Oman, approximately 7979 km².
- Temporal Scale: Data from 2006 to 2020, aggregated to monthly resolution. Lagged features at 1, 3, and 7 months/days; rolling averages of 7- and 30-day intervals.
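The lagged and rolling-mean features listed above can be sketched as follows. This is a minimal illustration that assumes a single, regularly sampled series and trailing windows that include the current step; the function name and output keys are hypothetical, and the summary does not say whether the paper's windows include or exclude the current observation.

```python
def build_features(series, lags=(1, 3, 7), windows=(7, 30)):
    """Return one feature dict per time step with lagged values and trailing means.

    Entries are None where there is not enough history to compute the feature.
    """
    rows = []
    for t, value in enumerate(series):
        row = {"value": value}
        for lag in lags:
            # Value observed `lag` steps before t.
            row[f"lag_{lag}"] = series[t - lag] if t >= lag else None
        for w in windows:
            if t + 1 >= w:
                window = series[t + 1 - w : t + 1]  # last w values, ending at t
                row[f"roll_mean_{w}"] = sum(window) / w
            else:
                row[f"roll_mean_{w}"] = None
        rows.append(row)
    return rows


# Example on a synthetic ramp series.
rows = build_features([float(i) for i in range(40)])
```

If the rolling mean is used to predict the value at the same step, the window would need to be shifted back by one step to avoid leaking the target into its own predictor.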
Methodology and Data
- Models used:
  - Primary: Custom-built two-layer Long Short-Term Memory (LSTM) network with a weighted quantile loss function.
  - Baseline: Random Forest (RF), Support Vector Machine (SVM), Artificial Neural Network (ANN).
  - Hybrid: LSTM-RF ensemble.
  - Uncertainty Quantification: Metropolis–Hastings Markov Chain Monte Carlo (MCMC).
  - Feature Engineering: Discrete Wavelet Transform (DWT) using Daubechies-4 (db4) wavelet.
- Data sources:
  - Precipitation (Rainfall): In situ measurements from the Ministry of Regional Municipalities and Water Resources.
  - Land Surface Temperature (LST): MODIS/Terra MOD11A2 (NASA LP DAAC).
  - Vegetation Index (NDVI): MODIS/Terra MOD13A1 (NASA LP DAAC).
  - Elevation (DEM) and Derived Topography: NASADEM (NASA JPL).
  - Soil properties (sand, silt, clay fractions, bulk density, gravel content): Derived from FAO-HWSD, Saxton et al. (1986), Wieder et al. (2014), and SoilGrids dataset (ISRIC).
  - Proximity metrics: Distance to infrastructure (e.g., nearest road distance).
  - Derived variables: Water Stress Index (WSI), lagged features, rolling means.
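The wavelet feature-engineering step can be illustrated with a small pure-Python decomposition. This sketch assumes the "db4" naming used by common wavelet libraries (an 8-tap Daubechies filter with four vanishing moments) and periodic boundary handling; the paper's boundary mode and decomposition depth are not stated in this summary, and the function names are hypothetical. In practice a dedicated wavelet library would normally be used instead.

```python
import math

# Daubechies scaling filter with 8 taps ("db4" in common library naming).
DB4_SCALING = [
    0.23037781330885523, 0.7148465705525415, 0.6308807679295904,
    -0.02798376941698385, -0.18703481171888114, 0.030841381835986965,
    0.032883011666982945, -0.010597401784997278,
]
# Wavelet (high-pass) filter from the quadrature mirror relation.
DB4_WAVELET = [
    ((-1) ** j) * DB4_SCALING[len(DB4_SCALING) - 1 - j]
    for j in range(len(DB4_SCALING))
]


def dwt_level(signal):
    """One decomposition level with periodic extension; returns (approx, detail)."""
    n = len(signal)
    if n % 2 != 0 or n < len(DB4_SCALING):
        raise ValueError("signal length must be even and at least 8")
    approx, detail = [], []
    for k in range(n // 2):
        a = d = 0.0
        for j, (h, g) in enumerate(zip(DB4_SCALING, DB4_WAVELET)):
            x = signal[(2 * k + j) % n]  # wrap around at the boundary
            a += h * x
            d += g * x
        approx.append(a)
        detail.append(d)
    return approx, detail


def wavedec(signal, levels):
    """Multi-level decomposition: final approximation plus details, finest first."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = dwt_level(approx)
        details.append(d)
    return approx, details


# Example: a slow trend plus a faster oscillation, decomposed over two levels.
series = [0.1 * i + math.sin(0.9 * i) for i in range(32)]
trend, fluctuations = wavedec(series, levels=2)
```

The detail coefficients capture the high-frequency variations, while the final approximation carries the slowly varying trend; either can be fed to a model as additional input features.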
Main Results
- The custom-loss LSTM model achieved the best performance: Mean Absolute Error (MAE) = 0.0222 mm/day, Root Mean Squared Error (RMSE) = 0.1098 mm/day, Coefficient of Determination (R²) = 0.8068, and Symmetric Mean Absolute Percentage Error (SMAPE) = 7.62%.
- The LSTM + RF ensemble model also performed well (MAE = 0.0275 mm/day, RMSE = 0.1125 mm/day, R² = 0.7971, SMAPE = 13.65%).
- A Kruskal–Wallis H-test confirmed a highly significant difference in prediction errors among the models (H = 14,021.12, p < 0.001), consistent with the custom LSTM's superior performance.
- The custom weighted loss function significantly outperformed standard loss functions (MSE, MAE, Quantile Loss q=0.9) in forecasting rainfall, particularly for extreme events.
- The Bayesian MCMC framework quantified predictive uncertainty, yielding a posterior mean for the standard deviation parameter (σ) of 0.118 with a 95% credible interval of [0.117, 0.120].
- Sensitivity analysis identified rainfall and Land Surface Temperature (LST) as the most influential predictors, while NDVI had a comparatively lower impact.
- Wavelet decomposition provided multi-scale insights, revealing high-frequency variations (short-term disturbances) and long-term trends (e.g., constant decrease in water stress, changing rainfall patterns).
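For reference, the four error metrics reported above can be computed as in the sketch below. SMAPE has several definitions in the literature; this version divides the absolute error by the mean of the absolute observed and predicted values and treats a zero denominator as zero error. Which variant the paper uses is not stated in this summary, so that choice is an assumption.

```python
import math


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def smape(y_true, y_pred):
    """Symmetric MAPE in percent; pairs with a zero denominator contribute zero."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        if denom > 0.0:
            total += abs(p - t) / denom
    return 100.0 * total / len(y_true)


# Small worked example with one wrong prediction out of four.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
scores = (mae(observed, predicted), rmse(observed, predicted),
          r2(observed, predicted), smape(observed, predicted))
```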
Contributions
- Development and evaluation of a novel custom extreme loss function for LSTM networks, specifically designed to prioritize accurate prediction of rare, high-impact extreme rainfall events in arid/semi-arid regions.
- Integration of multi-scale feature extraction using wavelet decomposition with dynamic (rainfall, LST, NDVI) and static (soil type, topography, infrastructure proximity) environmental variables.
- Quantification of predictive uncertainty using a Bayesian MCMC framework, providing probabilistic outputs and credible intervals for improved decision-making in flood risk management.
- Demonstrated superior performance of the custom LSTM model compared to traditional machine learning algorithms (RF, SVM, ANN) and an LSTM-RF ensemble, particularly in capturing extreme rainfall events.
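To make the uncertainty-quantification contribution concrete, the sketch below runs a random-walk Metropolis–Hastings sampler for a residual standard deviation σ. It assumes zero-mean Gaussian residuals and a flat prior on log σ; the paper's actual likelihood, priors, and proposal settings are not given in this summary, so these choices, the function names, and the synthetic residuals are all illustrative assumptions.

```python
import math
import random


def log_posterior(log_sigma, n, sum_sq):
    # Gaussian log-likelihood of zero-mean residuals, flat prior on log(sigma).
    return -n * log_sigma - sum_sq / (2.0 * math.exp(2.0 * log_sigma))


def sample_sigma(residuals, n_samples=4000, burn_in=1000, step=0.05, seed=0):
    """Random-walk Metropolis-Hastings on log(sigma); returns posterior draws of sigma."""
    rng = random.Random(seed)
    n = len(residuals)
    sum_sq = sum(r * r for r in residuals)
    current = 0.5 * math.log(sum_sq / n)  # start at the log of the RMS residual
    current_lp = log_posterior(current, n, sum_sq)
    draws = []
    for i in range(burn_in + n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, n, sum_sq)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            draws.append(math.exp(current))
    return draws


def summarize(draws):
    """Posterior mean and equal-tailed 95% credible interval."""
    ordered = sorted(draws)
    n = len(ordered)
    mean = sum(ordered) / n
    return mean, ordered[int(0.025 * n)], ordered[int(0.975 * n) - 1]


# Synthetic residuals for illustration only (not the study's data).
data_rng = random.Random(1)
residuals = [data_rng.gauss(0.0, 0.118) for _ in range(500)]
sigma_mean, sigma_lo, sigma_hi = summarize(sample_sigma(residuals))
```

Sampling on the log scale keeps σ positive without rejecting proposals at the boundary. The width of the resulting interval shrinks roughly with the square root of the number of residuals, which is why a large evaluation set can yield a credible interval as narrow as the one reported.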
Funding
- Sultan Qaboos University (SQU)
- Diwan of Royal Court
- His Majesty’s (HM) grant number SR/DVC/CESR/22/01
Citation
@article{AlRawas2026Leveraging,
author = {Al-Rawas, Ghazi and Nikoo, Mohammad Reza and Sadra, Nasim and Al-Wardy, Malik},
title = {Leveraging Machine Learning Flood Forecasting: A Multi-Dimensional Approach to Hydrological Predictive Modeling},
journal = {Water},
year = {2026},
doi = {10.3390/w18020192},
url = {https://doi.org/10.3390/w18020192}
}
Original Source: https://doi.org/10.3390/w18020192