Singh et al. (2025) From bias to forecast: advancing satellite rainfall accuracy and flood prediction with transformer modeling in the Kosi basin (India)
Identification
- Journal: Stochastic Environmental Research and Risk Assessment
- Year: 2025
- Date: 2025-09-22
- Authors: Aditya Kumar Singh, V. P. Singh, Ajit Kumar, Thendiyath Roshni
- DOI: 10.1007/s00477-025-03100-2
Research Groups
- Department of Civil Engineering, National Institute of Technology Patna, Bihar, India
- Department of Energy, Systems, Territory and Construction Engineering, University of Pisa, Pisa, Italy
Short Summary
This study improves the accuracy of satellite rainfall products (SRPs) through Random Forest bias correction, then feeds the best-performing product (IMERG) into a Transformer model for real-time flood forecasting in the Kosi River basin, India. Water level predictions remain robust at lead times of up to 14 days.
Objective
- To reduce bias in Satellite Rainfall Products (SRPs) using the Random Forest approach.
- To evaluate the accuracy of bias-corrected SRPs (IMERG, PERSIANN, MERRA-2) by comparing them with India Meteorological Department (IMD) ground-based rain gauge data at daily, monthly, seasonal, and annual time scales.
- To perform real-time flood forecasting using the best-performing SRP as rainfall input in a Transformer-based model.
Study Configuration
- Spatial Scale: Kosi River basin, India (latitudes 25°20′–26°48′ N, longitudes 86°20′–87°41′ E). Satellite rainfall product resolutions: 0.10° × 0.10° (IMERG), 0.50° × 0.625° (MERRA-2), and 0.25° × 0.25° (PERSIANN).
- Temporal Scale: 2001 to 2021 (21 years) for rainfall and water level data. Analysis conducted at daily, monthly, seasonal, and annual time scales. Flood forecasting performed for lead times of 1, 3, 5, 7, 10, and 14 days.
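As a rough illustration of what forecasting at these lead times involves, the sketch below pairs a window of past daily rainfall and water level with the water level observed `lead` days later. This summary does not state how the authors built their input windows, so the `lookback` length, the function name `make_samples`, and the toy series are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# Lead times (days) reported in the study configuration.
LEAD_TIMES = [1, 3, 5, 7, 10, 14]

Sample = Tuple[List[Tuple[float, float]], float]


def make_samples(
    rain: Sequence[float],
    level: Sequence[float],
    lookback: int,  # hypothetical window length; not taken from the paper
    lead: int,
) -> List[Sample]:
    """Pair the last `lookback` days of (rainfall, level) with the level `lead` days ahead."""
    if len(rain) != len(level):
        raise ValueError("rain and level must have the same length")
    samples: List[Sample] = []
    # t is the index of the last observed day in each input window.
    for t in range(lookback - 1, len(level) - lead):
        start = t - lookback + 1
        window = list(zip(rain[start : t + 1], level[start : t + 1]))
        samples.append((window, level[t + lead]))
    return samples


# Toy daily series (not data from the study).
rain = [float(i) for i in range(20)]
level = [100.0 + i for i in range(20)]
samples_by_lead = {lead: make_samples(rain, level, lookback=3, lead=lead) for lead in LEAD_TIMES}
```

Longer lead times leave fewer usable samples from the same record, because the last `lead` days have no target to pair with.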
Methodology and Data
- Models used:
  - Random Forest (RF) algorithm for bias correction of SRPs.
  - Transformer model (encoder architecture) for real-time flood forecasting.
- Data sources:
  - Satellite Rainfall Products (SRPs): Integrated Multi-Satellite Retrievals for GPM (IMERG), Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN), and Modern-Era Retrospective Analysis for Research and Applications Version 2 (MERRA-2).
  - Observation data: Rainfall data from eight ground-based rain gauge stations from the India Meteorological Department (IMD), Pune. Daily water level records at the Baltara gauging station from the Water Resources Division, Patna.
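The Random Forest bias-correction step can be sketched as a regression from the satellite estimate to the gauge observation. This is a minimal sketch under stated assumptions, not the authors' implementation: the data are synthetic, and the predictor set (satellite rainfall plus a seasonal encoding), the hyperparameters, and the 70/30 chronological split are all hypothetical choices. It uses scikit-learn's `RandomForestRegressor`.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the study): skewed "gauge" rainfall in mm/day
# and a satellite estimate with multiplicative and additive bias plus noise.
n_days = 1000
gauge = rng.gamma(shape=0.6, scale=12.0, size=n_days)
satellite = np.clip(0.5 * gauge + 2.0 + rng.normal(0.0, 1.0, n_days), 0.0, None)
day_of_year = np.arange(n_days) % 365 + 1

# Hypothetical predictors: the satellite estimate and a seasonal encoding.
X = np.column_stack([
    satellite,
    np.sin(2 * np.pi * day_of_year / 365.0),
    np.cos(2 * np.pi * day_of_year / 365.0),
])

# Chronological split: fit on the earlier record, evaluate on the later one.
split = int(0.7 * n_days)
model = RandomForestRegressor(n_estimators=200, min_samples_leaf=5, random_state=0)
model.fit(X[:split], gauge[:split])

# Corrected rainfall; clip so that the output is never negative.
corrected = np.clip(model.predict(X[split:]), 0.0, None)


def rmse(a: np.ndarray, b: np.ndarray) -> float:
    """Root mean square error between two equally sized arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


rmse_raw = rmse(satellite[split:], gauge[split:])
rmse_corrected = rmse(corrected, gauge[split:])
```

Evaluating on a held-out later period, rather than on the training days, gives a fairer picture of how the correction would behave on new satellite data.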
Main Results
- The Random Forest algorithm substantially reduced bias in SRPs, improving their alignment with observed rain gauge data across the Kosi basin.
- Among the bias-corrected SRPs, IMERG consistently outperformed MERRA-2 and PERSIANN across most statistical metrics (KGE, NSE, R², RMSE, POD, CSI) and temporal scales (daily, monthly, seasonal, annual).
  - At the daily scale, IMERG showed the highest Probability of Detection (86.76%) and strong Critical Success Index (79.88%).
  - At the monthly scale, IMERG had the highest R² (0.89), KGE (0.67), and lowest RMSE (44.40 mm).
  - IMERG demonstrated superior accuracy in classifying extreme and moderate precipitation events using SPI and MCZI.
  - IMERG showed the highest correlation with observed annual rainfall (r = 0.94).
- The Transformer model, using bias-corrected IMERG rainfall data and antecedent water levels, performed strongly for real-time water level prediction.
  - Highest testing accuracy was observed at 1-day lead time (R = 0.98, NSE = 0.96).
  - Robust accuracy was maintained even at 14-day lead time (R = 0.91, NSE = 0.82).
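For readers unfamiliar with the metrics quoted above, the sketch below gives their standard textbook definitions. It assumes the original (2009) Kling-Gupta formulation of KGE and a rain/no-rain threshold of 1 mm/day for POD and CSI; the paper may use different variants or thresholds, and the sample values are illustrative only.

```python
import math
from typing import Sequence, Tuple


def _mean(x: Sequence[float]) -> float:
    return sum(x) / len(x)


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    m = _mean(obs)
    return 1.0 - sum((o - s) ** 2 for o, s in zip(obs, sim)) / sum((o - m) ** 2 for o in obs)


def pearson_r(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Pearson correlation coefficient between observed and simulated series."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def kge(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Kling-Gupta efficiency (2009 form, assumed here): correlation, variability, bias."""
    mo, ms = _mean(obs), _mean(sim)
    r = pearson_r(obs, sim)
    alpha = math.sqrt(sum((s - ms) ** 2 for s in sim) / sum((o - mo) ** 2 for o in obs))
    beta = ms / mo
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def pod_csi(obs: Sequence[float], sim: Sequence[float], threshold: float = 1.0) -> Tuple[float, float]:
    """Probability of Detection and Critical Success Index for rain events.

    `threshold` (mm/day) is an assumed rain/no-rain cutoff, not a value from the paper.
    """
    hits = sum(o >= threshold and s >= threshold for o, s in zip(obs, sim))
    misses = sum(o >= threshold and s < threshold for o, s in zip(obs, sim))
    false_alarms = sum(o < threshold and s >= threshold for o, s in zip(obs, sim))
    return hits / (hits + misses), hits / (hits + misses + false_alarms)


# Illustrative daily rainfall values in mm (not data from the study).
obs = [0.0, 5.0, 10.0, 0.0, 20.0]
sim = [2.0, 4.0, 0.0, 0.0, 18.0]
pod, csi = pod_csi(obs, sim)
```

POD ignores false alarms while CSI penalizes them, which is why the reported CSI (79.88%) sits below the reported POD (86.76%).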
Contributions
- This study provides a novel approach for flood forecasting in data-scarce, flood-prone regions by rigorously assessing and bias-correcting satellite rainfall products and integrating them into an advanced machine learning framework.
- It demonstrates the practical value of SRPs as a reliable alternative to ground-based observations for real-time hydrological forecasting, especially in developing countries.
- The research highlights the effectiveness of combining Random Forest bias correction with a Transformer architecture for long-lead flood forecasting, offering a scalable and efficient solution for disaster preparedness and resource allocation.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Citation
@article{Singh2025From,
author = {Singh, Aditya Kumar and Singh, V. P. and Kumar, Ajit and Roshni, Thendiyath},
title = {From bias to forecast: advancing satellite rainfall accuracy and flood prediction with transformer modeling in the Kosi basin (India)},
journal = {Stochastic Environmental Research and Risk Assessment},
year = {2025},
doi = {10.1007/s00477-025-03100-2},
url = {https://doi.org/10.1007/s00477-025-03100-2}
}
Original Source: https://doi.org/10.1007/s00477-025-03100-2