Tefera et al. (2025) Integrating machine learning models with ground sensors to enhance soil moisture prediction in agroecosystems of Texas
Identification
- Journal: Computers and Electronics in Agriculture
- Year: 2025
- Date: 2025-12-23
- Authors: Gebrekidan Worku Tefera, Ram L. Ray, Reggie Jackson, Bhagya Deegala, Oyomire Akenzua
- DOI: 10.1016/j.compag.2025.111358
Research Groups
- Cooperative Agricultural Research Center, College of Agriculture, Food and Natural Resources, Prairie View A&M University, Prairie View, TX 77446, United States
Short Summary
This study enhances soil moisture prediction in Texas agroecosystems by integrating in situ sensor data with biometeorological variables in six machine and deep learning models. Random Forest, Extreme Gradient Boosting, and Long Short-Term Memory models achieved the highest predictive accuracy (R² ≥ 0.90, RMSE ≤ 0.021 m³ m⁻³), with uncertainty quantified by bootstrapping.
Objective
- To evaluate machine and deep learning models for half-hourly soil moisture prediction across crop and pasture agroecosystems in Southern Texas, utilizing flux-tower and soil moisture sensor data, with transparent hyperparameter tuning and robust uncertainty analysis.
- Hypotheses: 1) Machine and deep learning models' prediction performance differs between crop and pasture agroecosystems. 2) Tree-based models (e.g., Random Forest) outperform deep learning models by better handling multicollinearity among biometeorological predictors.
Study Configuration
- Spatial Scale: A 778-acre research farm at Prairie View A&M University, southeast Texas, USA, encompassing crop and pasture agroecosystems. Soil moisture sensors were installed across 42–72 subplots at depths of 0–15 cm, 16–30 cm, 31–60 cm, and 16–45 cm; only the 0–15 cm data were used in this study. Nine eddy covariance flux towers were distributed across the farm.
- Temporal Scale: Half-hourly (30-minute) intervals. Data covered August to December 2023 and July to November 2024 for crop agroecosystems, and August 2023 to March 2025 for pasture agroecosystems.
Methodology and Data
- Models used:
  - Machine Learning: Random Forest (RF), Support Vector Regression (SVR), Artificial Neural Networks (ANN), Extreme Gradient Boosting (XGBoost).
  - Deep Learning: Deep Neural Network (DNN), Long Short-Term Memory (LSTM).
- Data sources:
  - In situ soil moisture: TEROS soil moisture sensors (volumetric water content at 0–15 cm depth).
  - Biometeorological data (from eddy covariance flux towers): evapotranspiration (mm), air temperature (°C), ecosystem respiration (RECO) (µmol m⁻² s⁻¹), precipitation (mm), air pressure (kPa), soil heat flux (W m⁻²), outgoing longwave radiation (MJ m⁻² day⁻¹), relative humidity (%), sensible heat flux (W m⁻²), latent heat flux (W m⁻²).
- Data preprocessing: 10% missing data threshold, Multivariate Imputation by Chained Equations (MICE) for imputation, outlier detection (values outside 0.0–0.6 m³ m⁻³).
- Validation: 10-fold cross-validation, spatial hold-out validation (training on plots 1-5, testing on plot 1-6 for crop; training on plots 2-5, testing on plot 2-6 for pasture), bootstrapping (2000 iterations) for uncertainty quantification.
- Feature importance analysis: Spearman correlation, Variance Inflation Factor (VIF) for multicollinearity, normalized Gini impurity, and SHAP (SHapley Additive exPlanations).
- Hyperparameter optimization: Exhaustive grid search.
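The bootstrapped uncertainty estimate listed under Validation can be illustrated with a minimal sketch. It assumes a percentile bootstrap over paired observed and predicted values; the summary does not say which bootstrap variant the authors used. The function names, the synthetic soil moisture series, and the noise level are hypothetical and are not data from the study.

```python
import math
import random


def rmse(obs, pred):
    """Root mean square error between paired observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2(obs, pred):
    """Coefficient of determination (1 - SS_res / SS_tot)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def bootstrap_ci(obs, pred, metric, n_iter=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `metric` (assumed variant)."""
    rng = random.Random(seed)
    n = len(obs)
    stats = []
    for _ in range(n_iter):
        # Resample (observation, prediction) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([obs[i] for i in idx], [pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_iter)]
    upper = stats[int((1 - alpha / 2) * n_iter) - 1]
    return lower, upper


# Synthetic, illustrative series in m³ m⁻³ (not values from the paper).
_rng = random.Random(42)
obs = [0.20 + 0.10 * math.sin(i / 10) for i in range(200)]
pred = [o + _rng.gauss(0.0, 0.01) for o in obs]

rmse_point = rmse(obs, pred)
r2_point = r2(obs, pred)
rmse_ci = bootstrap_ci(obs, pred, rmse)  # 2000 iterations, as in the study
r2_ci = bootstrap_ci(obs, pred, r2)

print("RMSE:", rmse_point, "95% CI:", rmse_ci)
print("R2:", r2_point, "95% CI:", r2_ci)
```

A narrower interval means the metric is less sensitive to which samples end up in the evaluation set, which is how the interval widths in the results can be read.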
Main Results
- Soil moisture exhibited clear seasonal variation, with lower levels in summer (August) and higher levels in winter (December), primarily driven by air temperature.
- Pasture agroecosystems consistently showed higher soil moisture than crop systems, attributed to protective vegetation cover and improved soil structure.
- Gini impurity and SHAP analyses identified air temperature, ecosystem respiration (RECO), and soil heat flux as the most influential predictors of soil moisture across both agroecosystems. Concurrent rainfall showed relatively low importance, whereas lagged rainfall (4-hour and 1–2 day lags) had a greater influence.
- Random Forest (RF) and Extreme Gradient Boosting (XGBoost) models demonstrated superior predictive performance among machine learning models, achieving R² values ≥ 0.90 and RMSE ≤ 0.02 m³ m⁻³ for the crop agroecosystem.
- The Long Short-Term Memory (LSTM) deep learning model showed comparable performance to RF and XGBoost, with R² values of 0.91 for crop and 0.87 for pasture, and RMSE values of 0.024–0.025 m³ m⁻³ for crop.
- Artificial Neural Networks (ANN) exhibited the lowest predictive skill (R² = 0.83 for crop, 0.80 for pasture).
- Bootstrapping analysis confirmed the robustness of RF and LSTM models, showing narrower 95% confidence intervals for RMSE and R² compared to other models.
- Pairwise t-tests revealed statistically significant differences in prediction errors among most model pairs, except between RF and XGBoost in both agroecosystems, and between SVR and DNN, SVR and LSTM, and DNN and LSTM in the crop agroecosystem.
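The lagged rainfall predictors mentioned in the results can be sketched as trailing accumulations over the half-hourly series. This is an assumption: the summary does not state whether the authors used accumulated totals or shifted values, and the helper name `trailing_sum` and the toy rainfall series are hypothetical.

```python
STEPS_PER_HOUR = 2  # half-hourly records, per the study's temporal scale


def trailing_sum(values, window_steps):
    """Sum of the `window_steps` values strictly before each index.

    Returns None for indices that do not yet have a full window behind them.
    """
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    out = []
    for i in range(len(values)):
        if i < window_steps:
            out.append(None)
        else:
            # prefix[i] holds the sum of values[0..i-1].
            out.append(prefix[i] - prefix[i - window_steps])
    return out


# Toy rainfall series (mm per half hour): one short event, otherwise dry.
rain = [0.0] * 100
rain[10] = 5.0
rain[11] = 2.5

rain_lag_4h = trailing_sum(rain, 4 * STEPS_PER_HOUR)    # previous 8 steps
rain_lag_1d = trailing_sum(rain, 24 * STEPS_PER_HOUR)   # previous 48 steps
rain_lag_2d = trailing_sum(rain, 48 * STEPS_PER_HOUR)   # previous 96 steps

print(rain_lag_4h[12], rain_lag_4h[20], rain_lag_1d[58])
```

Excluding the current time step from each window keeps the lagged feature from duplicating the concurrent rainfall predictor.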
Contributions
- Integration of high-resolution (half-hourly) flux-tower biometeorological measurements with in situ soil moisture data for enhanced prediction.
- Direct comparative analysis of hydrometeorological dynamics and model performance across contrasting crop and pasture agroecosystems under similar climatic conditions.
- Enhanced interpretability of soil moisture drivers through the incorporation of SHAP, Gini relative importance, and multicollinearity analyses.
- A comprehensive and transparent comparative modeling framework for machine and deep learning models, including explicit hyperparameter reporting, rigorous 10-fold cross-validation, and robust uncertainty quantification via bootstrapping.
- Improved reproducibility, interpretability, and practical relevance of data-driven soil moisture prediction, with direct applications for irrigation scheduling, agricultural drought management, and sustainable water resource use in semi-arid agroecosystems.
Funding
- Shell International Exploration and Production Inc., USA.
Citation
@article{Tefera2025Integrating,
author = {Tefera, Gebrekidan Worku and Ray, Ram L. and Jackson, Reggie and Deegala, Bhagya and Akenzua, Oyomire},
title = {Integrating machine learning models with ground sensors to enhance soil moisture prediction in agroecosystems of Texas},
journal = {Computers and Electronics in Agriculture},
year = {2025},
doi = {10.1016/j.compag.2025.111358},
url = {https://doi.org/10.1016/j.compag.2025.111358}
}
Original Source: https://doi.org/10.1016/j.compag.2025.111358