Ganiyu et al. (2026) Enhancing flood simulation in data-sparse Niger central hydrological area river basin in Nigeria using machine learning-based data fusion
Identification
- Journal: Theoretical and Applied Climatology
- Year: 2026
- Date: 2026-03-12
- Authors: Habeeb Oladimeji Ganiyu, Wan Zurina Wan Jaafar, Faridah Othman, Cia Yik Ng
- DOI: 10.1007/s00704-026-06091-4
Research Groups
- Department of Civil Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Malaysia
- Department of Civil Engineering, Faculty of Engineering and Technology, Kwara State University, Malete, Nigeria
Short Summary
This study enhances flood event simulation in the data-sparse Niger Central Hydrological Area River Basin in Nigeria by fusing daily downscaled PERSIANN-CDR satellite precipitation with observed rainfall data using machine learning models. The Random Forest (RF) model demonstrated superior accuracy in data fusion, significantly improving precipitation estimates and subsequently leading to more reliable flood simulations with the HEC-HMS hydrological model.
Objective
- To evaluate the fusion of daily downscaled PERSIANN-CDR and observed rainfall data using three machine learning models (Random Forest, Extreme Gradient Boosting, and Long Short-Term Memory) across five gauging stations between 2013 and 2022.
- To validate the effectiveness of the enhanced satellite precipitation product by applying the fused PERSIANN-CDR from the best-performing fusion model, alongside individual datasets, as inputs to the HEC-HMS model for simulating three extreme flood events.
Study Configuration
- Spatial Scale: Niger Central Hydrological Area River Basin, Nigeria, covering an area of 47,310 square kilometers, located between 7°39’E and 9°52’E longitude and 4°36’N and 7°35’N latitude.
- Temporal Scale: Daily data from 2013 to 2022 (10 years) for rainfall and discharge, with specific flood events used for calibration and validation.
Methodology and Data
- Models used:
  - Machine Learning: Extreme Gradient Boosting (XGB), Random Forest (RF), Long Short-Term Memory (LSTM) for data fusion.
  - Hydrological: HEC-HMS (version 4.11) for rainfall-runoff simulation, employing SCS curve number, SCS unit hydrograph, Recession, and Muskingum methods.
  - Ancillary Tools: HEC-GeoHMS and ArcHydro tools (ArcGIS extensions) for basin delineation and parameter generation; RHtests_dlyPrcp for rainfall homogeneity check.
- Data sources:
  - Satellite Precipitation: PERSIANN-CDR (0.25° spatial resolution, downscaled to 0.1°), obtained from NOAA.
  - Observation Data: Daily observed rainfall data (2013–2022) from five gauging stations (Abuja, Baro, Bida, Lokoja, Minna) provided by the Nigerian Meteorological Agency (NIMET). Daily discharge and water level data (2013–2022) for the Lokoja gauge station from the Nigerian Inland Waterways (NIWA).
  - Geospatial Data: ALOS PALSAR Digital Elevation Model (DEM) at 12.5 m spatial resolution; FAO-UNESCO soil map at 10 km resolution; Sentinel-2 Land Use Land Cover (LULC) map (2022 version) at 10 m resolution.
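The SCS curve number loss method listed above converts a rainfall depth into direct runoff. The sketch below shows the standard event-total form of that relation in metric units. It is illustrative only: the function name, the curve number of 80, and the rainfall depths are hypothetical and are not taken from the paper, and HEC-HMS itself applies the relation incrementally over the storm rather than to a single total.

```python
def scs_runoff_depth(rain_mm: float, curve_number: float, ia_ratio: float = 0.2) -> float:
    """Direct runoff depth (mm) for a storm total, using the SCS curve number relation.

    ia_ratio is the initial-abstraction ratio; 0.2 is the conventional default
    and is an assumption here, not a value reported in the paper.
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve_number must be in (0, 100]")
    # Potential maximum retention S in millimetres.
    retention = 25400.0 / curve_number - 254.0
    # Rainfall lost before any runoff starts.
    initial_abstraction = ia_ratio * retention
    if rain_mm <= initial_abstraction:
        return 0.0
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


# Hypothetical example: 100 mm of rain on a sub-basin with CN = 80.
runoff = scs_runoff_depth(100.0, 80.0)
print(f"runoff depth: {runoff:.2f} mm")
```

Because runoff responds nonlinearly to rainfall depth in this relation, errors in the precipitation input carry through to simulated flow disproportionately, which is one reason the quality of the fused rainfall product matters for the flood simulations.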
Main Results
- The Random Forest (RF) model outperformed Extreme Gradient Boosting (XGB) and Long Short-Term Memory (LSTM) in fusing downscaled PERSIANN-CDR with observed rainfall data.
- RF-fused PERSIANN-CDR improved the mean correlation coefficient (R) of the downscaled PERSIANN-CDR by 113.48%.
- RF-fused PERSIANN-CDR reduced the basin-averaged Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Bias by 29.90%, 26.76%, and 75.71%, respectively.
- Hydrological validation using HEC-HMS for extreme flood events (2018, 2019 for calibration; 2022 for validation) showed that RF-fused PERSIANN-CDR significantly improved simulated flow.
- During calibration, RF-fused PERSIANN-CDR achieved higher Nash–Sutcliffe Efficiency (NSE) values (0.75 and 0.60), higher Coefficient of Determination (R²) values (0.87 and 0.64), and lower Ratio of RMSE to Standard Deviation (RSR) values (0.5 and 0.6) than the individual datasets.
- During validation, the RF-fused data showed strong agreement with observed discharge, with R² of 0.72, NSE of 0.61, and RSR of 0.6.
- Uncertainty quantification using bootstrapped 95% confidence intervals and paired Wilcoxon tests (p < 0.05) confirmed the statistical significance and robustness of the improvements achieved by the RF-fused precipitation data.
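The three hydrological scores reported above can be computed from paired observed and simulated discharge series. The sketch below uses the common textbook definitions and assumes that R² is the squared Pearson correlation, which the summary does not state explicitly. The discharge values are made up for illustration. Under these definitions RSR equals the square root of (1 − NSE), and the reported pairs (for example NSE 0.75 with RSR 0.5) agree with that identity to one decimal place.

```python
import math


def _sum_sq_dev(values, center):
    """Sum of squared deviations of values from center."""
    return sum((v - center) ** 2 for v in values)


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variability of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / _sum_sq_dev(obs, mean_obs)


def rsr(obs, sim):
    """Ratio of RMSE to the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return math.sqrt(sse) / math.sqrt(_sum_sq_dev(obs, mean_obs))


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated series."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    return cov ** 2 / (_sum_sq_dev(obs, mean_obs) * _sum_sq_dev(sim, mean_sim))


# Hypothetical discharge series (m^3/s), not data from the study.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
simulated = [12.0, 18.0, 33.0, 37.0, 52.0]

print("NSE:", round(nse(observed, simulated), 3))
print("RSR:", round(rsr(observed, simulated), 3))
print("R2 :", round(r_squared(observed, simulated), 3))
```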
Contributions
- Presents a novel and practical framework for enhancing satellite precipitation estimates through machine learning-based data fusion for improved flood simulations in data-sparse regions, specifically in Nigeria.
- First study in Nigeria to comprehensively evaluate the effectiveness of machine learning-fused satellite precipitation products (SPPs) for hydrological modeling of extreme flood events using RF, XGB, and LSTM models.
- Demonstrates the significant influence of enhanced SPPs on the accuracy of extreme flood simulations, providing a robust foundation for future spatial regionalization and effective flood risk mitigation strategies.
- Integrates rigorous uncertainty quantification using bootstrap median differences, 95% confidence intervals, and Wilcoxon tests, adding statistical robustness to the evaluation of model performance and the observed improvements.
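One way to obtain a bootstrapped 95% confidence interval for a median paired difference is the percentile bootstrap, sketched below with the Python standard library only. This is an assumed procedure rather than the authors' exact workflow: the function name, the number of resamples, the seed, and the error values are all hypothetical. The paired Wilcoxon test mentioned above is not implemented here; in Python it is available as `scipy.stats.wilcoxon`.

```python
import random
import statistics


def bootstrap_median_diff_ci(baseline, improved, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the median of paired differences (baseline - improved).

    Returns (median_difference, lower_bound, upper_bound).
    """
    if len(baseline) != len(improved):
        raise ValueError("series must be paired and of equal length")
    diffs = [b - i for b, i in zip(baseline, improved)]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)
    # Resample the paired differences with replacement and record each median.
    medians = sorted(
        statistics.median(rng.choices(diffs, k=n)) for _ in range(n_boot)
    )
    lower_idx = max(0, int((alpha / 2) * n_boot))
    upper_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return statistics.median(diffs), medians[lower_idx], medians[upper_idx]


# Hypothetical absolute errors (mm/day) for the same days before and after fusion.
errors_before = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
errors_after = [3.0, 4.5, 5.0, 6.0, 6.5, 7.0]

median_diff, ci_low, ci_high = bootstrap_median_diff_ci(errors_before, errors_after)
print(f"median reduction: {median_diff}, 95% CI: [{ci_low}, {ci_high}]")
```

If the interval excludes zero, the reduction in error is unlikely to be an artifact of which days happened to be sampled, which is the sense in which such intervals support the robustness claim.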
Funding
- Tertiary Education Trust Fund (TETFUND), Nigeria (Overseas PhD Scholarship Scheme).
- Ministry of Higher Education, Malaysia (FRGS grant, Grant number: FRGS/1/2020/TK0/UM/02/19).
Citation
@article{Ganiyu2026Enhancing,
author = {Ganiyu, Habeeb Oladimeji and Jaafar, Wan Zurina Wan and Othman, Faridah and Ng, Cia Yik},
title = {Enhancing flood simulation in data-sparse Niger central hydrological area river basin in Nigeria using machine learning-based data fusion},
journal = {Theoretical and Applied Climatology},
year = {2026},
doi = {10.1007/s00704-026-06091-4},
url = {https://doi.org/10.1007/s00704-026-06091-4}
}
Original Source: https://doi.org/10.1007/s00704-026-06091-4