Fatima et al. (2025) Machine Learning–Based Bias Correction of Model-Simulated Soil Moisture Using In-situ AWS Observations Over India
Identification
- Journal: Earth Systems and Environment
- Year: 2025
- Date: 2025-12-08
- Authors: Hashmi Fatima, S. Dhivagar, V. S. Prasad, Jaya Singh
- DOI: 10.1007/s41748-025-00964-w
Research Groups
- Ministry of Earth Sciences, National Centre for Medium Range Weather Forecasting, Gautam Buddha Nagar, Noida, Uttar Pradesh, India
- Nehru Memorial College, Putthanampatti, Tiruchirappalli, Tamil Nadu, India
- Atmospheric Sciences Research Center, University at Albany, NY, USA
Short Summary
This study evaluates statistical and machine learning methods for bias correction of model-simulated soil moisture over India using in-situ observations, finding that machine learning (specifically XGBoost) significantly improves accuracy and correlation across all soil layers.
Objective
- To develop and evaluate machine learning-based bias correction techniques for model-simulated soil moisture over the Indian region, with the aim of producing an accurate, bias-corrected dataset that supports water management and the prediction of severe weather events.
Study Configuration
- Spatial Scale: Indian subcontinent, covering approximately 200 Automatic Weather Station (AWS) locations.
- Temporal Scale: AWS observations and model analysis from June 2022 to December 2024.
Methodology and Data
- Models used:
  - Land Surface Model: JULES (Joint UK Land Environment Simulator)
  - Numerical Weather Prediction (NWP) System: NCMRWF Unified Model (NCUM)
  - Bias Correction Algorithms: Quantile Mapping (QM), Random Forest (RF), XGBoost (Extreme Gradient Boosting)
- Data sources:
  - In-situ observations: Daily soil moisture data from approximately 200 Automatic Weather Stations (AWS) installed by the India Meteorological Department (IMD), providing measurements for four soil layers (0–0.1 m, 0.1–0.35 m, 0.35–1 m, 1–3 m).
  - Model data: IMDAA-like products derived from the operational NCMRWF Unified Model (NCUM) global NWP system, with a 12 km spatial resolution, providing hourly soil moisture data for the same four soil layers.
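Quantile Mapping, the statistical baseline listed above, can be illustrated with a minimal empirical sketch. It assumes paired training samples of model and AWS soil moisture for a single layer. The function names (`empirical_cdf`, `empirical_quantile`, `quantile_map`) and all sample values are hypothetical, and the QM configuration actually used in the paper is not described in this summary.

```python
from bisect import bisect_left


def empirical_cdf(sorted_vals, x):
    """Non-exceedance probability of x in an ascending list, clamped to [0, 1]."""
    if x <= sorted_vals[0]:
        return 0.0
    if x >= sorted_vals[-1]:
        return 1.0
    i = bisect_left(sorted_vals, x)  # sorted_vals[i - 1] < x <= sorted_vals[i]
    left, right = sorted_vals[i - 1], sorted_vals[i]
    return (i - 1 + (x - left) / (right - left)) / (len(sorted_vals) - 1)


def empirical_quantile(sorted_vals, p):
    """Linearly interpolated quantile of an ascending list for p in [0, 1]."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def quantile_map(model_train, obs_train, model_new):
    """Map each new model value to the observed value at the same quantile."""
    model_sorted = sorted(model_train)
    obs_sorted = sorted(obs_train)
    return [
        empirical_quantile(obs_sorted, empirical_cdf(model_sorted, x))
        for x in model_new
    ]


# Hypothetical soil moisture samples (illustrative only, not from the paper).
model_train = [0.10, 0.20, 0.30, 0.40, 0.50]
obs_train = [0.22, 0.18, 0.35, 0.30, 0.26]

# Values outside the training range are clamped to the observed extremes.
corrected = quantile_map(model_train, obs_train, [0.05, 0.30, 0.45, 0.60])
print(corrected)
```

Because this mapping is monotonic, it reshapes the distribution of the model values without changing their ordering in time, so it cannot by itself repair a weak or negative correlation with the observations.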
Main Results
- Initial model-simulated soil moisture products consistently underpredicted seasonal variability and showed very weak to negative correlations with AWS observations across all soil layers (e.g., Layer 1: -0.0144, Layer 4: -0.1110).
- The traditional statistical method, Quantile Mapping (QM), showed unsatisfactory performance, with negligible improvement in correlation coefficients and a negative R² score, indicating an inadequate fit to the data.
- Machine learning methods significantly improved the correlation and reduced biases:
  - Random Forest (RF) improved correlation coefficients (e.g., Layer 1: 0.4604, Layer 4: 0.7794) but sometimes exhibited over-concentration or overfitting.
  - XGBoost (XGB) demonstrated superior performance, substantially reducing both positive and negative biases and significantly enhancing correlation coefficients for all soil layers:
    - Layer 1 (0–0.1 m): from -0.0144 to 0.7552
    - Layer 2 (0.1–0.35 m): from 0.1071 to 0.7919
    - Layer 3 (0.35–1 m): from -0.0053 to 0.8369
    - Layer 4 (1–3 m): from -0.1110 to 0.9019
- XGBoost achieved the lowest Root Mean Square Error (RMSE), Unbiased Root Mean Square Error (UBRMSE), and Mean Absolute Error (MAE), along with the highest Coefficient of Determination (R²) score among all tested methods.
- Heatmaps of soil moisture bias after XGBoost correction showed a marked narrowing of the bias range across all soil layers (e.g., the Layer 1 bias range narrowed from [-0.5, 0.3] to [-0.2, 0.2]).
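The evaluation scores named in these results can be sketched as follows. The block uses the standard textbook definitions, which are assumed here because the paper's exact formulas are not given in this summary. The function name `evaluation_metrics` and the sample values are hypothetical.

```python
import math


def evaluation_metrics(obs, sim):
    """Return bias, RMSE, ubRMSE, MAE, R^2 and Pearson r for paired series."""
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("obs and sim must be paired series of length >= 2")

    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    errors = [s - o for o, s in zip(obs, sim)]
    bias = mean_sim - mean_obs

    ss_res = sum(e * e for e in errors)            # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)  # spread of the observations
    ss_sim = sum((s - mean_sim) ** 2 for s in sim)  # spread of the simulation
    if ss_tot == 0 or ss_sim == 0:
        raise ValueError("R^2 and correlation are undefined for a constant series")

    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    return {
        "bias": bias,
        "rmse": math.sqrt(ss_res / n),
        # ubRMSE removes the mean bias before squaring the errors.
        "ubrmse": math.sqrt(sum((e - bias) ** 2 for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        # R^2 turns negative when the errors exceed the observed variance.
        "r2": 1.0 - ss_res / ss_tot,
        "r": cov / math.sqrt(ss_tot * ss_sim),
    }


# Hypothetical example: a simulation with a constant wet offset of 0.10.
obs = [0.20, 0.25, 0.30, 0.35]
sim = [o + 0.10 for o in obs]
metrics = evaluation_metrics(obs, sim)
print(metrics)
```

In this constructed example the correlation is perfect and the ubRMSE is essentially zero, yet R² is negative because the constant offset is large relative to the variance of the observations. This illustrates how a method can return a negative R² score, as reported above for Quantile Mapping, and why the scores are best read together.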
Contributions
- This study is among the first to apply advanced machine learning techniques for bias correction of IMDAA-like model-simulated soil moisture products over India using in-situ AWS observations.
- It quantitatively demonstrates the superior effectiveness of ensemble machine learning algorithms (specifically XGBoost) over traditional statistical bias correction methods (Quantile Mapping) for improving soil moisture data accuracy.
- It provides a robust, bias-corrected soil moisture dataset that can enhance hydrological modeling, agricultural planning, and disaster preparedness for extreme weather events in the Indian region.
- It highlights the significant potential of machine learning to advance hydrological applications and improve water resource management.
Funding
- National Monsoon Mission (NMM) project under the Ministry of Earth Sciences, Government of India.
Citation
@article{Fatima2025Machine,
author = {Fatima, Hashmi and Dhivagar, S. and Prasad, V. S. and Singh, Jaya},
title = {Machine Learning–Based Bias Correction of Model-Simulated Soil Moisture Using In-situ AWS Observations Over India},
journal = {Earth Systems and Environment},
year = {2025},
doi = {10.1007/s41748-025-00964-w},
url = {https://doi.org/10.1007/s41748-025-00964-w}
}
Original Source: https://doi.org/10.1007/s41748-025-00964-w