Ochege et al. (2025) Enhancing reference crop evapotranspiration prediction in arid regions: A stacking ensemble learning approach for the Amu Darya basin
Identification
- Journal: Smart Agricultural Technology
- Year: 2025
- Date: 2025-10-21
- Authors: Friday Uchenna Ochege, Xiuliang Yuan, Geping Luo
- DOI: 10.1016/j.atech.2025.101554
Research Groups
- State Key Laboratory of Ecological Safety and Sustainable Development in Arid Lands, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, China
- Key Laboratory of GIS & RS Application, Xinjiang Uygur Autonomous Region, Urumqi, China
- Department of Geography and Environmental Management, University of Port Harcourt, Port Harcourt, Rivers State, Nigeria
Short Summary
This study developed a novel stacking ensemble (stkENS) machine learning model, hybridizing Decision Trees, Generalized Linear Models, K-Nearest Neighbours, and Support Vector Regression, to enhance reference crop evapotranspiration (ETo) prediction in the data-limited Amu Darya basin. The stkENS model significantly outperformed individual base learners, achieving high accuracy (R² > 0.96, RMSE: 0.65 mm d⁻¹) with fewer inputs, providing robust ETo estimates crucial for sustainable water management in arid croplands.
Objective
- To optimize cropland evapotranspiration (ET) prediction by exploring the feasibility and effectiveness of hybridizing strong and weak learners for accurate daily reference crop evapotranspiration (ETo) prediction in the data-limited Amu Darya basin (ADB).
- To design, construct, and optimize a hybrid predictive model using a stacking ensemble (stkENS) framework, ensuring its robustness and applicability in ADB.
- To systematically evaluate the performance of individual base learners and compare the learning accuracy of the derived stkENS against its constituent base learners across multiple prediction scenarios.
Study Configuration
- Spatial Scale: Amu Darya Irrigation Basin (ADB), Central Asia, spanning approximately 2400 km². The study focused on three crop fields (cotton, rice, sorghum) located at World Meteorological Organization (WMO) historical sites (station IDs 38149, 38262, and 38392).
- Temporal Scale: Daily meteorological variables from 1983 to 2018. The dataset was split into a training set (1983-2013) and a testing set (2014-2018).
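The chronological split described above can be sketched as follows. This is a minimal illustration using only the standard library; the record layout (a list of `(day, value)` tuples), the helper name `chronological_split`, and the placeholder values are assumptions made for the example, not details taken from the paper.

```python
from datetime import date, timedelta

# Last training day, following the 1983-2013 / 2014-2018 split in the summary.
TRAIN_END = date(2013, 12, 31)

def chronological_split(records):
    """Split (day, value) records into train and test sets by calendar date."""
    train = [r for r in records if r[0] <= TRAIN_END]
    test = [r for r in records if r[0] > TRAIN_END]
    return train, test

# Hypothetical gap-free daily series for 1983-01-01 .. 2018-12-31.
# The 0.0 values are placeholders, not observations.
start, end = date(1983, 1, 1), date(2018, 12, 31)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
records = [(d, 0.0) for d in days]

train, test = chronological_split(records)
print(len(train), len(test))
```

Splitting by date rather than at random keeps every test day later than every training day, so the test score reflects prediction on unseen years. With a complete daily record, this split puts roughly 86% of days in training and 14% in testing.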
Methodology and Data
- Models used:
  - Benchmark Model: FAO-56 Penman-Monteith (FAO56-PM) model.
  - Base Learners: Decision Trees (DT), Generalized Linear Models (GLM), K-Nearest Neighbours (KNN), Support Vector Regression (SVR).
  - Super Learner/Ensemble Model: Stacking Ensemble (stkENS) model, utilizing XGBoost as the meta-learner.
- Data sources: Daily meteorological variables retrieved from the historical records of three WMO sites and obtained from the National Climatic Data Center (NCDC): maximum air temperature (Tmax), minimum air temperature (Tmin), wind speed at 2 m (U2), solar radiation (Rn), dew point (DewP), and vapor pressure deficit (VPD).
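The stacking arrangement listed above (four base learners feeding a meta-learner) might be set up roughly as below with scikit-learn. This is a sketch under stated assumptions, not the authors' implementation: the data are synthetic stand-ins, the hyperparameters are arbitrary, `LinearRegression` stands in for the GLM (a Gaussian GLM with identity link), and scikit-learn's `GradientBoostingRegressor` is swapped in for the XGBoost meta-learner so the example needs only one library.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the study's records): four made-up predictors
# and a made-up target with a nonlinear interaction plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 4))
y = 2.0 + 3.0 * X[:, 0] + 1.5 * X[:, 1] * X[:, 2] + rng.normal(0.0, 0.1, 400)

# Ordered split: first 80% of rows for training, last 20% for testing.
X_train, X_test = X[:320], X[320:]
y_train, y_test = y[:320], y[320:]

# Base learners named after the summary; all settings are assumptions.
base_learners = [
    ("dt", DecisionTreeRegressor(max_depth=5, random_state=0)),
    ("glm", LinearRegression()),
    ("knn", make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))),
    ("svr", make_pipeline(StandardScaler(), SVR(C=10.0))),
]

# The meta-learner is trained on out-of-fold predictions of the base learners
# (5-fold cross-validation), which is what distinguishes stacking from
# simple averaging of model outputs.
stk_ens = StackingRegressor(
    estimators=base_learners,
    final_estimator=GradientBoostingRegressor(random_state=0),  # stand-in for XGBoost
    cv=5,
)
stk_ens.fit(X_train, y_train)

pred = stk_ens.predict(X_test)
rmse = float(np.sqrt(np.mean((pred - y_test) ** 2)))
print(f"test RMSE on synthetic data: {rmse:.3f}")
```

If the xgboost package is installed, its scikit-learn-compatible regressor could be passed as `final_estimator` instead. The scores printed here describe only the synthetic data and say nothing about the accuracy reported in the paper.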
Main Results
- The stkENS model gave the best daily ETo predictions, with a coefficient of determination (R²) > 0.96, the lowest bias (0.01 mm d⁻¹), a Root Mean Square Error (RMSE) of 0.65 mm d⁻¹, and a Mean Absolute Error (MAE) of 0.42 mm d⁻¹.
- stkENS significantly outperformed individual base learners: DT (R²: 0.73, RMSE: 1.12 mm d⁻¹, MAE: 0.86 mm d⁻¹), GLM (R²: 0.85, RMSE: 0.83 mm d⁻¹, MAE: 0.60 mm d⁻¹), KNN (R²: 0.89, RMSE: 0.72 mm d⁻¹, MAE: 0.46 mm d⁻¹), and SVR (R²: 0.85, RMSE: 0.84 mm d⁻¹, MAE: 0.59 mm d⁻¹).
- The relative uncertainty of stkENS in the ADB was 10.44%, which was considerably lower than that of KNN (14.77%), SVR (16.95%), GLM (23.29%), and DT (34.49%).
- Mean stkENS ET estimates matched the FAO56-PM ETo for cotton (3 mm d⁻¹) and sorghum (3 mm d⁻¹), and differed by only 0.01 mm d⁻¹ for rice.
- Feature importance analysis showed that temperature-based inputs (Tmax, Tmin, es, Rn), where es is the saturation vapor pressure, dominated for DT, KNN, and SVR, while wind speed (U2) was most important for GLM. SHapley Additive exPlanation (SHAP) analysis indicated that, among the base learners, KNN contributed most positively to the stkENS predictions.
- The stkENS model exhibited robust generalizability and consistent high performance across various input combinations and prediction scenarios for cotton, rice, and sorghum fields, highlighting its effectiveness in integrating the strengths of individual ML models.
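The four scores quoted in these results (R², bias, RMSE, MAE) can be computed as in the sketch below. The definitions are the common textbook ones and are an assumption here: the summary does not state whether the paper's R² is 1 − SS_res/SS_tot or a squared correlation, nor its sign convention for bias (taken here as predicted minus observed). The sample values are invented for illustration.

```python
import math

def evaluate(observed, predicted):
    """Return bias, MAE, RMSE and R² for paired daily values (mm d⁻¹)."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]  # predicted minus observed
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    return {
        "bias": sum(errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Invented example values, not data from the study.
benchmark_eto = [2.0, 3.0, 4.0, 5.0]   # e.g. FAO56-PM ETo used as reference
model_eto = [2.5, 2.5, 4.5, 4.5]       # e.g. model predictions

scores = evaluate(benchmark_eto, model_eto)
print(scores)
```

For these invented values the errors alternate between +0.5 and −0.5, so the bias is 0 while MAE and RMSE are both 0.5 and R² is 0.8. This shows why a near-zero bias alone does not imply small errors, and why the results report it alongside RMSE and MAE.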
Contributions
- Introduction and operationalization of a novel standalone stacking ensemble (stkENS) machine learning model for enhanced and accurate ETo prediction in data-limited arid regions, specifically the Amu Darya basin.
- Demonstration of the feasibility and effectiveness of hybridizing both strong and weak learners (DT, GLM, KNN, SVR with XGBoost as meta-learner) to achieve superior predictive accuracy with minimal input variables.
- Development of a robust and generalizable model suitable for improving irrigation scheduling and water use efficiency in ungauged arid croplands, addressing critical water scarcity challenges.
- Provision of valuable insights into optimal input variable combinations and the interpretability of individual model contributions to the ensemble through SHAP analysis.
- Emphasis on the need to extend stacked generalization methods to other dryland regions, to better understand their applicability under changing environmental conditions.
Funding
- National Natural Science Foundation of China (Grant No. W2433110)
- Tianshan Talent Cultivation (Grant No. 2022TSYCLJ0001)
- Key Projects of the Natural Science Foundation of Xinjiang Autonomous Region (Grant No. 2022D01D01)
Citation
@article{Ochege2025Enhancing,
author = {Ochege, Friday Uchenna and Yuan, Xiuliang and Luo, Geping},
title = {Enhancing reference crop evapotranspiration prediction in arid regions: A stacking ensemble learning approach for the Amu Darya basin},
journal = {Smart Agricultural Technology},
year = {2025},
doi = {10.1016/j.atech.2025.101554},
url = {https://doi.org/10.1016/j.atech.2025.101554}
}
Original Source: https://doi.org/10.1016/j.atech.2025.101554