Qaraghuli et al. (2026) New multivariate composite remote sensing drought index based on machine learning and geospatial techniques, insights from Northern Iraq
Identification
- Journal: Journal of Hydrology: Regional Studies
- Year: 2026
- Date: 2026-02-10
- Authors: Khalid Qaraghuli, Mohamad Fared Murshed, Md Azlin Md Said, Ali Salem, Ali Mokhtar
- DOI: 10.1016/j.ejrh.2026.103211
Research Groups
- School of Civil Engineering, Universiti Sains Malaysia, Nibong Tebal, Pulau Pinang, Malaysia
- Al-Mussaib Technical Institute, Al-Furat Al-Awsat Technical University, Babil, Iraq
- Department of Agricultural Engineering, Faculty of Agriculture, Cairo University, Giza, Egypt
- Civil Engineering Department, Faculty of Engineering, Minia University, Minia, Egypt
- Structural Diagnostics and Analysis Research Group, Faculty of Engineering and Information Technology, University of Pécs, Pécs, Hungary
Short Summary
This study developed and evaluated five machine learning models for predicting the Standardized Precipitation Evapotranspiration Index (SPEI) at 3-month and 6-month timescales in Northern Iraq using satellite-based and gridded data. Random Forest (RF) and Extreme Gradient Boosting (XGB) consistently outperformed the other three models. SHAP analysis identified precipitation as the dominant driver of short-term droughts, while temperature, vegetation indices, and soil moisture were more influential for medium-term droughts.
Objective
- To develop and compare the performance of five state-of-the-art machine learning models for predicting SPEI at 3-month and 6-month timescales in Northern Iraq.
- To evaluate the predictive power of different variable groups—including meteorological, land surface, and topographic data—through a series of structured scenarios.
- To identify the optimal model and input combination that provides the highest accuracy for drought prediction, establishing a foundation for an operational drought monitoring and forecasting system in the region.
Study Configuration
- Spatial Scale: Nineveh governorate, Northern Iraq, covering an area of 32,308 square kilometers. Data were resampled to a common spatial resolution of 1000 meters.
- Temporal Scale: Data covering 2001–2023 for model training and testing. SPEI was computed at 3-month (SPEI03) and 6-month (SPEI06) timescales, with all datasets aggregated to a monthly temporal scale.
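The 3-month and 6-month timescales can be illustrated with a minimal sketch. In the standard SPEI formulation, the monthly climatic water balance (precipitation minus PET) is summed over the preceding months before being standardized. The function name and the sample values below are hypothetical, and the standardization step (commonly a fitted log-logistic distribution) is deliberately left out, since the summary does not describe how the authors implemented it.

```python
def accumulate_water_balance(precip, pet, window):
    """Rolling sum of the monthly climatic water balance D = P - PET.

    Returns None for months that lack `window` months of history.
    """
    if len(precip) != len(pet):
        raise ValueError("precip and pet must have the same length")
    balance = [p - e for p, e in zip(precip, pet)]
    accumulated = []
    for i in range(len(balance)):
        if i + 1 < window:
            accumulated.append(None)  # not enough preceding months yet
        else:
            accumulated.append(sum(balance[i + 1 - window : i + 1]))
    return accumulated


# Hypothetical monthly values in mm (illustrative only, not from the study).
precip = [60.0, 45.0, 30.0, 10.0, 0.0, 0.0, 5.0, 20.0]
pet = [20.0, 30.0, 55.0, 90.0, 130.0, 160.0, 150.0, 100.0]

d3 = accumulate_water_balance(precip, pet, window=3)  # input to SPEI03
d6 = accumulate_water_balance(precip, pet, window=6)  # input to SPEI06
print(d3)
print(d6)
```

The longer window smooths short wet or dry spells, which is why SPEI06 responds more slowly than SPEI03 to a single month of rainfall.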
Methodology and Data
- Models used: Random Forest (RF), Extreme Gradient Boosting (XGB), Support Vector Regression (SVR), Gradient Boosting Machine (GBM), Artificial Neural Networks (ANN). K-means clustering was used for regionalization. SHapley Additive exPlanations (SHAP) was applied for feature importance analysis.
- Data sources:
  - Station Data: Monthly meteorological observations (1992–2013) from seven stations in Nineveh governorate, sourced from the Iraqi Meteorological Organization and Seismology (IMOS).
  - Gridded Datasets: TerraClimate (monthly, approximately 4 km resolution, 1958–present) for precipitation, maximum temperature, minimum temperature, wind speed, and potential evapotranspiration (PET). CHIRPS (0.05° resolution) for precipitation. GLDAS-Noah (0.25° resolution) for soil moisture.
  - Remote Sensing Data: MODIS products at 1000 m resolution, 2000–present (MOD13A3, monthly, for NDVI and EVI; MOD11A2, 8-day, for Land Surface Temperature (LST)). USGS SRTM DEM (30 m resolution, static) for elevation.
  - Derived Indices: Soil Moisture Condition Index (SMCI), Temperature Condition Index (TCI), Vegetation Condition Index (VCI), Vegetation Health Index (VHI), Precipitation Condition Index (PCI), Rainfall Anomaly Index (RAI), Slope (SLP), and Roughness (Rghn).
- Bias Correction: Empirical Quantile Mapping (EQM) was applied to gridded precipitation and PET datasets using observed station data.
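The bias-correction step can be sketched as follows. Empirical quantile mapping replaces each gridded value with the station value that has the same non-exceedance probability in a shared reference period. This is a minimal sketch under stated assumptions: the function names, the linear interpolation between empirical quantiles, and the sample values are illustrative choices, not details taken from the paper.

```python
from bisect import bisect_right


def empirical_cdf(sorted_values, x):
    """Interpolated non-exceedance probability of x in a sorted sample."""
    n = len(sorted_values)
    if x <= sorted_values[0]:
        return 0.0
    if x >= sorted_values[-1]:
        return 1.0
    k = bisect_right(sorted_values, x) - 1  # sorted_values[k] <= x < sorted_values[k + 1]
    frac = (x - sorted_values[k]) / (sorted_values[k + 1] - sorted_values[k])
    return (k + frac) / (n - 1)


def empirical_quantile(sorted_values, q):
    """Linearly interpolated quantile (0 <= q <= 1) of a sorted sample."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def eqm_correct(value, gridded_ref, station_ref):
    """Map a gridded value onto the station distribution by matching quantiles."""
    q = empirical_cdf(sorted(gridded_ref), value)
    return empirical_quantile(sorted(station_ref), q)


# Hypothetical reference samples in mm (illustrative only, not from the study).
gridded_ref = [0.0, 10.0, 20.0, 30.0, 40.0]
station_ref = [0.0, 4.0, 12.0, 28.0, 50.0]
print(eqm_correct(25.0, gridded_ref, station_ref))
```

Values outside the range of the gridded reference sample are clamped to the station minimum or maximum in this sketch; an operational implementation would need an explicit extrapolation rule for such cases.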
Main Results
- Random Forest (RF) and Extreme Gradient Boosting (XGB) models consistently outperformed Artificial Neural Networks (ANN), Support Vector Regression (SVR), and Gradient Boosting Machine (GBM) across all four predictor scenarios for both SPEI03 and SPEI06.
- In Scenario 01 (all 17 predictors), XGB achieved the highest accuracy for SPEI03 (R² = 0.905, NSE = 0.896, RMSE = 0.304) and SPEI06 (R² = 0.909, NSE = 0.899, RMSE = 0.298). RF showed very similar performance.
- In Scenario 02 (six meteorological predictors), RF maintained strong performance for SPEI03 (R² = 0.880, NSE = 0.875, RMSE = 0.328) and SPEI06 (R² = 0.866, NSE = 0.860, RMSE = 0.348), highlighting the dominant role of meteorological drivers.
- Scenarios 03 (soil and vegetation only) and 04 (six diverse variables) showed reduced but still effective performance for RF and XGB, demonstrating their adaptability to varied input sets.
- SHAP analysis revealed distinct regional variations in predictor influence:
  - For short-term drought (SPEI03), precipitation (Pre) was consistently the most influential variable across all regions. Other significant factors included Rainfall Anomaly Index (RAI), Digital Elevation Model (DEM), maximum temperature (Tmax), Land Surface Temperature (LST), and Temperature Condition Index (TCI).
  - For medium-term drought (SPEI06), the influence shifted from precipitation dominance to a complex interaction of Land Surface Temperature (LST), vegetation indices (NDVI, EVI, VHI), and soil moisture (SM), particularly in the northern and southern regions.
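The three accuracy metrics reported above (R², NSE, RMSE) can be computed as in the sketch below. It assumes R² is the squared Pearson correlation between observed and predicted SPEI, which the summary does not state explicitly. The function names and sample values are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance_sum = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / variance_sum


def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R² here)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical SPEI values (illustrative only, not from the study).
obs = [-1.5, -0.5, 0.0, 0.5, 1.5]
sim = [-1.2, -0.6, 0.1, 0.4, 1.3]
print(rmse(obs, sim), nse(obs, sim), r_squared(obs, sim))
```

Under this definition, NSE penalizes bias and scale errors while the squared correlation does not, so NSE can never exceed R². That is consistent with the reported NSE values sitting slightly below the corresponding R² values.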
Contributions
- Developed and validated advanced machine learning models (RF, XGB) for predicting SPEI at multiple timescales in Northern Iraq, a climate-vulnerable and data-scarce region.
- Systematically evaluated the impact of different predictor variable groups (meteorological, land surface, topographic) on drought prediction accuracy, identifying optimal input combinations.
- Provided novel hydrological insights into regional drought dynamics by identifying the primary drivers of short-term (precipitation-dominated) versus medium-term (temperature, vegetation, soil moisture-driven) droughts using explainable AI (SHAP analysis).
- Proposed a robust, data-driven modeling framework that can strengthen early warning systems and support proactive drought risk management for local stakeholders.
- Contributed to the transition from diagnostic SPEI calculation to a predictive framework using integrated remote sensing and machine learning techniques.
Funding
The authors acknowledge the supportive environment provided by the School of Civil Engineering at Universiti Sains Malaysia. Gratitude is also extended to the Iraqi Meteorological Organization and Seismology for supplying meteorological station data. Ali Salem contributed to funding acquisition.
Citation
@article{Qaraghuli2026New,
author = {Qaraghuli, Khalid and Murshed, Mohamad Fared and Said, Md Azlin Md and Salem, Ali and Mokhtar, Ali},
title = {New multivariate composite remote sensing drought index based on machine learning and geospatial techniques, insights from Northern Iraq},
journal = {Journal of Hydrology: Regional Studies},
year = {2026},
doi = {10.1016/j.ejrh.2026.103211},
url = {https://doi.org/10.1016/j.ejrh.2026.103211}
}
Original Source: https://doi.org/10.1016/j.ejrh.2026.103211