Tandon et al. (2026) Integrating IMDAA Regional Reanalysis and Machine Learning for Enhanced Detection of Extreme Precipitation Over Complex Himalayan Terrain
Identification
- Journal: Earth Systems and Environment
- Year: 2026
- Date: 2026-04-10
- Authors: Aayushi Tandon, Kanhu Charan Pattnayak, Amit Awasthi
- DOI: 10.1007/s41748-026-01117-3
Research Groups
- Department of Applied Sciences, School of Advanced Engineering, UPES, Dehradun, India
- Hadley Centre for Climate, Met Office, Exeter, UK
Short Summary
This study integrates high-resolution IMDAA reanalysis with machine learning to improve detection of extreme precipitation over complex Himalayan terrain. Random Forest outperforms the Support Vector Machine, narrowly in mean accuracy (81.6% vs. 80.9%) and decisively in precision for extreme events (0.80 vs. 0.000), and the authors present it as a reliable diagnostic framework.
Objective
- To characterize the spatiotemporal evolution of precipitation extremes across North India's diverse topography and rigorously evaluate the efficacy of machine learning classifiers (Random Forest and Support Vector Machine) in detecting high-impact weather events, particularly mesoscale convective extremes.
Study Configuration
- Spatial Scale: North India, encompassing the Himalayan Orogen (Jammu & Kashmir, Himachal Pradesh, Uttarakhand) and the Indo-Gangetic Alluvial Plains (Punjab, Haryana, Rajasthan, Uttar Pradesh, Delhi). Data processed on a 0.12° × 0.12° (~12 km) grid across approximately 6,500 spatial points.
- Temporal Scale: 1979–2022 (44 years) for daily accumulated rainfall and meteorological predictors.
Methodology and Data
- Models used:
  - Supervised machine learning classifiers: Random Forest (RF) and Support Vector Machine (SVM) with Radial Basis Function (RBF) kernel.
  - Statistical trend analysis: Non-parametric Mann-Kendall (MK) test and Sen’s Slope estimator.
- Data sources:
  - Indian Monsoon Data Assimilation and Analysis (IMDAA) reanalysis dataset (0.12° × 0.12° resolution).
  - Meteorological predictors: wind speed, maximum temperature, minimum temperature, pressure, relative humidity, and solar radiation.
  - Target variable: Daily accumulated rainfall, categorized into 'Dry' (≤ 10th percentile), 'Normal' (10th–95th percentile), and 'Extreme' (> 95th percentile) based on local climatology.
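The percentile-based labelling of the target variable can be sketched as follows. This is a minimal pure-Python illustration, not the authors' code: the helper names `percentile` and `label_rainfall` are hypothetical, the linear-interpolation percentile is only one of several common definitions, and the thresholds are computed over all days at a single grid point (this summary does not say whether dry days were excluded before computing them).

```python
from typing import List


def percentile(values: List[float], q: float) -> float:
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def label_rainfall(daily_mm: List[float]) -> List[str]:
    """Label each day from this grid point's own climatology.

    Dry: <= 10th percentile, Extreme: > 95th percentile, Normal: in between.
    Assumption: percentiles are taken over every day in the record.
    """
    p10 = percentile(daily_mm, 10)
    p95 = percentile(daily_mm, 95)
    labels = []
    for rain in daily_mm:
        if rain > p95:
            labels.append("Extreme")
        elif rain <= p10:
            labels.append("Dry")
        else:
            labels.append("Normal")
    return labels


# Hypothetical record for one grid point (not IMDAA data).
sample_mm = [float(v) for v in range(1, 101)]
sample_labels = label_rainfall(sample_mm)
print({name: sample_labels.count(name) for name in ("Dry", "Normal", "Extreme")})
```

Because the thresholds come from each location's own record, 'Extreme' stays a small minority class (about 5% of days by construction), which is the class imbalance the classifiers have to cope with.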
Main Results
- An increasing annual precipitation trend of +3.0 mm/decade (p < 0.05) was quantified across the Himalayan foothills.
- The Random Forest (RF) classifier achieved a slightly higher mean spatial accuracy (81.6%) than the Support Vector Machine (SVM, 80.9%).
- RF was more spatially robust: it reached high-reliability coverage (accuracy > 0.85) across 74.4% of the domain (4,866 locations), compared with 58.8% (3,847 locations) for SVM.
- For 'Extreme' precipitation events, RF achieved a precision of 0.80 and perfect specificity (1.000), effectively reducing false-alarm rates. In contrast, SVM completely failed to detect 'Extreme' events (Precision, Recall, and F1-Score of 0.000).
- RF's recall for 'Extreme' events was low at 0.077, meaning it missed roughly 92% of extreme events despite its high precision.
- Decadal analysis showed a rapid intensification of the hydrological cycle, with the mean annual precipitation increasing from 111 mm in the 1980s to 130 mm in the 2020s (a partial decade, since the record ends in 2022).
- Feature importance analysis for RF identified Minimum Temperature (22.8%) and Relative Humidity (20.3%) as the dominant drivers of precipitation variability.
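The pattern reported for the 'Extreme' class (high precision, near-perfect specificity, very low recall) follows from how these metrics are defined on a one-vs-rest confusion matrix. The sketch below uses hypothetical counts chosen only to mimic that pattern; they are not the paper's confusion matrix. The names `BinaryCounts` and `extreme_metrics` are illustrative, and returning 0.0 for an undefined ratio is an assumed convention that reproduces the all-zero scores of a classifier that never predicts 'Extreme'.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BinaryCounts:
    tp: int  # extreme days predicted as extreme
    fp: int  # non-extreme days predicted as extreme (false alarms)
    fn: int  # extreme days predicted as something else (misses)
    tn: int  # non-extreme days predicted as non-extreme


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the ratio is undefined (assumed convention)."""
    return num / den if den else 0.0


def extreme_metrics(c: BinaryCounts) -> Dict[str, float]:
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    specificity = safe_div(c.tn, c.tn + c.fp)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical test set: 10,000 days, 500 of them extreme.
cautious = BinaryCounts(tp=40, fp=10, fn=460, tn=9490)      # rarely says "Extreme"
never_extreme = BinaryCounts(tp=0, fp=0, fn=500, tn=9500)   # never says "Extreme"

for name, counts in (("cautious", cautious), ("never_extreme", never_extreme)):
    scores = extreme_metrics(counts)
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

A classifier that flags 'Extreme' only when it is very sure produces few false alarms, so precision and specificity stay high, but it also leaves most true extremes unflagged, so recall stays low. Overall accuracy hides this because the majority classes dominate the count.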
Contributions
- Provides a novel application of high-resolution (12 km) IMDAA reanalysis data combined with advanced machine learning (RF and SVM) to capture mesoscale extreme precipitation events across approximately 6,500 spatial points in North India.
- Offers a direct comparative evaluation of ensemble-based (RF) versus geometric (SVM) machine learning classifiers, specifically addressing the challenge of class imbalance in extreme weather datasets within a complex, high-elevation, and data-sparse context.
- Establishes the superiority of Random Forest for meteorological applications in complex terrain, offering a reliable and cost-effective diagnostic framework for operational forecasting and disaster mitigation strategies in North India.
- Quantifies significant spatiotemporal trends and variability in precipitation extremes, providing evidence-based insights for agricultural resilience, infrastructure planning, and data-driven water resource governance.
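The trend quantification rests on the Mann-Kendall test and Sen's slope estimator named in the methodology. A minimal pure-Python sketch of both is given below, assuming one value per year, serially independent observations, and the normal approximation for the test statistic; the function names are hypothetical and the series is synthetic, so this illustrates the technique rather than the authors' implementation.

```python
import math
from collections import Counter
from statistics import median
from typing import Sequence, Tuple


def mann_kendall(series: Sequence[float]) -> Tuple[int, float, float]:
    """Return (S, Z, two-sided p-value) for a monotonic-trend test."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    # Variance of S, with the standard correction for groups of tied values.
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - ties) / 18.0
    if var_s <= 0:
        return s, 0.0, 1.0  # too short or all values identical: no evidence of trend
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return s, z, p_value


def sens_slope(series: Sequence[float]) -> float:
    """Median of all pairwise slopes, in units per time step (needs >= 2 values)."""
    n = len(series)
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)


# Synthetic 44-value annual series (hypothetical, not IMDAA data).
annual_mm = [110.0 + 0.3 * k + (1.5 if k % 2 else -1.5) for k in range(44)]
s_stat, z_stat, p_val = mann_kendall(annual_mm)
print(f"S={s_stat}, Z={z_stat:.2f}, p={p_val:.4f}")
print(f"Sen's slope: {10 * sens_slope(annual_mm):.2f} mm/decade")
```

Multiplying the per-year slope by 10 gives the mm/decade unit used for the reported trend. Both statistics depend only on the ordering and pairwise differences of the values, which makes them robust to the skewed distributions typical of precipitation.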
Funding
- University of Petroleum and Energy Studies (provided research resources)
- National Centre for Medium Range Weather Forecasting (NCMRWF) (acknowledged for IMDAA dataset creation)
Citation
@article{Tandon2026Integrating,
author = {Tandon, Aayushi and Pattnayak, Kanhu Charan and Awasthi, Amit},
title = {Integrating IMDAA Regional Reanalysis and Machine Learning for Enhanced Detection of Extreme Precipitation Over Complex Himalayan Terrain},
journal = {Earth Systems and Environment},
year = {2026},
doi = {10.1007/s41748-026-01117-3},
url = {https://doi.org/10.1007/s41748-026-01117-3}
}
Original Source: https://doi.org/10.1007/s41748-026-01117-3