Ali and Ahmed (2025) Aquifer-specific flood forecasting using machine learning: A comparative analysis for three distinct sedimentary aquifers
Identification
- Journal: The Science of The Total Environment
- Year: 2025
- Date: 2025-10-30
- Authors: Ali J. Ali, Ashraf Ahmed
- DOI: 10.1016/j.scitotenv.2025.180756
Research Groups
- Department of Civil and Environmental Engineering, Brunel University London, Uxbridge, UB8 3PH, United Kingdom
Short Summary
This study compares four machine learning models (TFT, Informer, LSTM, XGBoost) for multi-horizon (1-4 days) flood forecasting across three distinct sedimentary aquifers (Limestone, Chalk, Greensand) in the Thames Basin, UK. The results show that model accuracy depends strongly on aquifer-specific hydrogeological characteristics: Limestone yields very high accuracy (R² = 0.98–0.99), whereas Greensand is poorly predictable (R² ≤ 0).
Objective
- To ascertain how aquifer-specific variables affect the prediction reliability of four machine learning models (Temporal Fusion Transformer, Informer, Long Short-Term Memory, and XGBoost) for multi-horizon (1-4 days) flood forecasting in the Thames Basin, UK.
- To enhance flood forecasting and risk management in the Thames Basin by combining hydrological records of rainfall, groundwater levels, and river stages with advanced machine learning algorithms.
Study Configuration
- Spatial Scale: Thames Basin, UK (area exceeding 16,200 km²), focusing on three distinct sedimentary aquifer types: Chalk, Limestone, and Greensand. Three monitoring sites were selected within each aquifer type.
- Temporal Scale: Data collected from April 2011 to early 2025. Hourly observations were aggregated to daily averages. Multi-horizon flood forecasting was performed for lead times of 1 to 4 days.
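The temporal setup above (hourly readings averaged to daily values, then forecasts at lead times of 1 to 4 days) can be illustrated with a minimal sketch. The function names `daily_means` and `multi_horizon_targets` are hypothetical and not taken from the paper, and dropping a trailing partial day is an assumption made here for simplicity.

```python
from typing import Dict, List, Sequence, Tuple


def daily_means(hourly: Sequence[float], hours_per_day: int = 24) -> List[float]:
    """Average consecutive blocks of hourly readings into one value per full day."""
    n_days = len(hourly) // hours_per_day  # assumption: drop a trailing partial day
    return [
        sum(hourly[d * hours_per_day:(d + 1) * hours_per_day]) / hours_per_day
        for d in range(n_days)
    ]


def multi_horizon_targets(
    daily: Sequence[float], horizons: Sequence[int] = (1, 2, 3, 4)
) -> List[Tuple[int, Dict[int, float]]]:
    """Pair each forecast day t with the observed value at t + h for every lead time h."""
    max_h = max(horizons)
    # Only days with a full set of future observations can serve as forecast origins.
    return [
        (t, {h: daily[t + h] for h in horizons})
        for t in range(len(daily) - max_h)
    ]
```

Each returned pair holds a forecast origin `t` and the four values a model would be asked to predict from that day, so the last four days of a series never appear as origins.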
Methodology and Data
- Models used: Temporal Fusion Transformer (TFT), Informer, Long Short-Term Memory (LSTM), Extreme Gradient Boosting (XGBoost).
- Data sources: Environment Agency's Hydrological Data Explorer and local weather stations. Data types included rainfall, groundwater levels (GWL), and river levels.
- Data preprocessing: Hourly observations were aggregated to daily averages. Missing data were filled using linear interpolation. All variables were normalized to a 0-1 scale using the MinMaxScaler method. Lagged features (1-3 days) and 3-day rolling averages for rainfall, river level, and GWL were created.
- Validation: A strict chronological holdout was used: the first 85% of the dataset served for training/validation, and the final 15% was held out as the test set.
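The preprocessing and validation steps listed above can be sketched in plain Python. This is an illustrative outline under stated assumptions, not the authors' code: all function names are hypothetical, edge gaps are filled with the nearest observation, and the rolling average is assumed to include the current day.

```python
from typing import List, Optional, Sequence, Tuple


def interpolate_linear(values: Sequence[Optional[float]]) -> List[float]:
    """Fill gaps (None) by linear interpolation between the nearest observations."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    out = list(values)
    for i in range(len(out)):
        if out[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:  # assumption: leading gaps copy the first observation
            out[i] = values[right]
        elif right is None:  # assumption: trailing gaps copy the last observation
            out[i] = values[left]
        else:
            w = (i - left) / (right - left)
            out[i] = values[left] + w * (values[right] - values[left])
    return out


def min_max_scale(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Map values to the 0-1 scale given the minimum and maximum to scale by."""
    span = hi - lo
    return [(v - lo) / span for v in values]


def lag(values: Sequence[float], k: int) -> List[Optional[float]]:
    """Shift the series k days back; the first k entries have no history."""
    return [None] * k + list(values[:len(values) - k])


def rolling_mean(values: Sequence[float], window: int = 3) -> List[Optional[float]]:
    """Trailing mean over `window` days, including the current day."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def chronological_split(rows: Sequence, train_frac: float = 0.85) -> Tuple[list, list]:
    """Split time-ordered rows without shuffling: earliest part first, latest part held out."""
    cut = int(len(rows) * train_frac)
    return list(rows[:cut]), list(rows[cut:])
```

The summary does not state whether the scaling bounds were computed on the full record or on the training portion only. Passing `lo` and `hi` explicitly lets a reader take them from the training split, a common way to keep test-period information out of the features.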
Main Results
- Model performance varied significantly based on aquifer type and forecasting horizon.
- Limestone Aquifer: Demonstrated very high prediction accuracy across all models and horizons, with R² values consistently between 0.98 and 0.99. TFT and Informer showed marginally better performance at shorter horizons (SMAPE: 1.53–1.64%). This high accuracy is attributed to rapid and distinct groundwater-river interactions (correlation coefficient, r = 0.84).
- Chalk Aquifer: Showed moderate prediction accuracy. For a 1-day horizon, R² values ranged from 0.77 to 0.80, decreasing to 0.48–0.62 for a 4-day horizon. LSTM and Informer performed slightly better at shorter horizons, while TFT maintained more stable performance over longer horizons. Groundwater-river interaction showed a moderate correlation (r = 0.26).
- Greensand Aquifer: Exhibited poor forecasting ability across all models and horizons, with R² values often low or negative (R² ≤ 0), particularly for horizons longer than two days. This is attributed to delayed, complex, and diffuse groundwater-river interactions, and a weak negative association between river levels and groundwater (r = -0.14).
- Transformer-based models (TFT and Informer) generally outperformed XGBoost, especially in aquifers with rapid groundwater-river responses (e.g., Limestone). LSTM also proved to be a reliable sequential baseline.
- Prediction accuracy consistently decreased with increasing forecasting horizons (1-4 days) across all aquifers and models.
- RMSE and MAE values, reported on the normalized 0-1 scale, reflected these trends (e.g., Limestone RMSE: 0.02–0.04; Greensand RMSE: 0.06–0.07).
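The metrics quoted in these results (R², RMSE, MAE, SMAPE, and the Pearson correlation r) can be written out as a short sketch. The function names are illustrative. The SMAPE variant shown, with the mean of |y| and |ŷ| in the denominator, is one common definition and is assumed here, since the summary does not say which form the paper uses.

```python
import math
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric MAPE in percent (assumed variant: mean of |y| and |y_hat| as denominator)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        if denom > 0.0:  # a pair of exact zeros contributes no error
            total += abs(t - p) / denom
    return 100.0 * total / len(y_true)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Under this definition, a model that always predicts the mean of the test observations scores R² = 0, and anything worse than that scores below zero. That is how the Greensand results can be negative while RMSE on the 0-1 scale still looks small.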
Contributions
- Presents the first aquifer-specific, multi-horizon comparative analysis of flood forecasting in the Thames Basin using advanced machine learning.
- Explicitly demonstrates that geological variability across different aquifer types (Chalk, Limestone, Greensand) significantly impacts the reliability of flood forecasts, challenging assumptions of hydrological homogeneity in previous studies.
- Provides the first evidence that transformer-based models can detect consistent differences in predictability driven by subsurface controls.
- Introduces a novel paradigm for flood forecasting that integrates both data-driven approaches and hydrogeological information.
- Offers a more physically consistent early warning method by combining groundwater level data with advanced transformer architectures.
- Highlights the critical influence of subsurface hydrology on prediction reliability, revealing aquifer-specific geological limitations in ML-based forecasting, with direct implications for resilience planning and flood risk management at the watershed scale.
Funding
- UKRI project 10063665
Citation
@article{Ali2025Aquiferspecific,
author = {Ali, Ali J. and Ahmed, Ashraf},
title = {Aquifer-specific flood forecasting using machine learning: A comparative analysis for three distinct sedimentary aquifers},
journal = {The Science of The Total Environment},
year = {2025},
doi = {10.1016/j.scitotenv.2025.180756},
url = {https://doi.org/10.1016/j.scitotenv.2025.180756}
}
Original Source: https://doi.org/10.1016/j.scitotenv.2025.180756