Lin et al. (2026) Data-driven attribution of evapotranspiration dynamics in the Heihe River Basin: Controlling factors from site measurements to regional satellite observations
Identification
- Journal: Agricultural Water Management
- Year: 2026
- Date: 2026-04-02
- Authors: Ziqi Lin, Yu Feng, Jie Ying Wu, Daozhi Gong, Jinran Xiong, Xinde Cao, Chunmiao Zheng
- DOI: 10.1016/j.agwat.2026.110324
Research Groups
- School of Environmental Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- School of the Environment and Sustainable Engineering, Eastern Institute of Technology, Ningbo, China
- Institute of Environment and Sustainable Development in Agriculture, Chinese Academy of Agricultural Sciences, Beijing, China
- Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo, China
Short Summary
This study quantifies scale-dependent evapotranspiration (ET) dynamics and their controlling factors in the Heihe River Basin by integrating decade-long in-situ flux measurements with multi-source satellite products using an interpretable ensemble machine learning framework. It reveals that while air temperature and leaf area index are primary drivers at the site scale, regional ET patterns are dominated by climatic factors with divergent sensitivities across satellite products, emphasizing the need for scale-aware water management strategies.
Objective
- Evaluate the performance of interpretable machine learning (IML) models in predicting evapotranspiration (ET) across site to regional scales.
- Quantitatively assess and compare the relative importance of various ET drivers at both site and regional scales.
- Explore the complex interactions between environmental drivers and their influence on ET.
Study Configuration
- Spatial Scale: Heihe River Basin (approximately 0.14 million km²), analyzed at the site scale (15 eddy covariance sites) and the regional scale (satellite products at 500 m and 0.1°, approximately 11.1 km, spatial resolution).
- Temporal Scale: 2013–2023 (decade-long). Data aggregated to 8-day intervals, with some daily analysis for comparison. Hourly data from ERA5-Land. Seasonal analysis (growing season, spring, summer, autumn).
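The 8-day aggregation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' processing code. It assumes MODIS-style periods that restart on 1 January of each year and a simple mean over the available days; a sum would be the natural choice for products that report 8-day totals. The function name `aggregate_8day` and the sample values are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean


def aggregate_8day(daily):
    """Average a {date: value} daily series into 8-day periods.

    Assumption: periods restart on 1 January of each year (MODIS-style),
    so the last period of a year holds 5 days (6 in leap years).
    """
    buckets = {}
    for day, value in daily.items():
        if value is None:  # skip gaps in the record
            continue
        doy = day.timetuple().tm_yday            # day of year, 1-based
        start_doy = ((doy - 1) // 8) * 8 + 1     # first day of the period
        start = date(day.year, 1, 1) + timedelta(days=start_doy - 1)
        buckets.setdefault(start, []).append(value)
    # One mean value per period, keyed by the period's first day.
    return {start: mean(vals) for start, vals in sorted(buckets.items())}


# Synthetic daily ET values (mm/day) for 1-16 January 2013; made up for the demo.
daily_et = {date(2013, 1, 1) + timedelta(days=i): float(i) for i in range(16)}
composites = aggregate_8day(daily_et)
```

Keying each composite by its first day makes it straightforward to align site data with 8-day satellite products that use the same period boundaries.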
Methodology and Data
- Models used:
- Ensemble Machine Learning Framework (Entropy-weighted TOPSIS ensemble)
- Base learners: Random Forest (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Support Vector Machine (SVM), Artificial Neural Network (ANN), CatBoost (CAT), Gradient Boosting Regression Trees (GBRT).
- Interpretable Machine Learning (IML) techniques: SHapley Additive exPlanations (SHAP), with TreeSHAP for interactions.
- Generalized Additive Models (GAMs) for dependence plots.
- Penman-Monteith model (underlying MOD16A2H, PML v2), Penman model (underlying GLEAM4.2a).
- Data sources:
- In-situ observations: Eddy covariance (EC) flux measurements from 15 sites in the Heihe River Basin (2012–2023), including latent heat (LE), sensible heat (H), soil heat flux (G), net radiation (Rn), carbon dioxide concentration (CO2), solar radiation (Rg), wind speed (WS), air temperature (Ta), surface soil moisture (SSM), root zone soil moisture (RZSM). Data from National Tibetan Plateau Data Center.
- Satellite products:
- MOD16A2H (MODIS/Terra Net Evapotranspiration, 8-day, 500 m)
- GLEAM4.2a (Global Land Evaporation and Soil Moisture, daily, 0.1°)
- PML v2 (Penman-Monteith-Leuning model product, 8-day, 500 m)
- MCD15A2 (MODIS Leaf Area Index, 8-day, 500 m)
- Reanalysis data: ERA5-Land (hourly 2-meter Ta, 2-meter dew point temperature, WS, soil moisture at four depths, 0.1°).
- Other: Global 1 km resolution atmospheric carbon dioxide concentration dataset (2003–2023). 30 m land cover map of the Heihe River Basin for 2015.
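To make the entropy-weighted TOPSIS ensemble listed under the models more concrete, the sketch below shows one common way such a scheme can be set up. It is an illustration under stated assumptions, not the authors' implementation: each base learner is scored on a few validation metrics, the entropy weight method assigns a weight to each metric, TOPSIS ranks the learners by closeness to the ideal solution, and the closeness scores are normalized into blending weights. The blending rule, the function names, and all metric values are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: criteria that vary more across learners weigh more.

    Assumes positive entries and at least one criterion that differs
    between learners (otherwise the weights are undefined).
    """
    m, n = len(matrix), len(matrix[0])
    divergence = []
    for j in range(n):
        col = [row[j] for row in matrix]
        total = sum(col)
        probs = [x / total for x in col]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        divergence.append(1.0 - entropy)
    s = sum(divergence)
    return [d / s for d in divergence]


def topsis_closeness(matrix, weights, benefit):
    """Relative closeness of each row (learner) to the ideal solution."""
    n = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v)
            for j in range(n)]
    worst = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v)
             for j in range(n)]
    return [math.dist(r, worst) / (math.dist(r, best) + math.dist(r, worst))
            for r in v]


def blend(predictions, weights):
    """Weighted average of the base learners' predictions for one sample."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical validation metrics per learner: [nRMSE, R2, Pearson r].
metrics = [
    [0.30, 0.87, 0.93],  # learner A
    [0.35, 0.82, 0.91],  # learner B
    [0.45, 0.70, 0.84],  # learner C
]
benefit = [False, True, True]  # nRMSE is a cost; R2 and r are benefits

weights = entropy_weights(metrics)
scores = topsis_closeness(metrics, weights, benefit)
ensemble_weights = [s / sum(scores) for s in scores]
```

In this toy example learner A is best on every metric and learner C is worst on every metric, so their closeness scores are 1 and 0 respectively. A learner with zero closeness drops out of the blend entirely, which is one reason an actual implementation might apply a floor or a different normalization.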
Main Results
- The ensemble machine learning model achieved superior predictive accuracy at the site scale (normalized Root Mean Squared Error = 0.297, coefficient of determination R² = 0.87, Pearson correlation coefficient r = 0.934).
- At the site scale, climate-related factors contributed 57.4% to ET variability, followed by vegetation (28.6%), soil moisture (9.6%), and CO2 (7.4%). Individually, air temperature (Ta, 34.3%) and leaf area index (LAI, 28.6%) were the dominant drivers.
- Specific physiological thresholds were identified: vapor pressure deficit (VPD) levels above 1.10 kPa were associated with reduced SHAP contributions, potentially reflecting stomatal regulation. Ta showed a tipping point at approximately 286.85 K (13.7 °C).
- Regionally, climatic factors dominated ET patterns (61–82% contribution) across satellite products, but driver sensitivities diverged: GLEAM4.2a was overwhelmingly driven by Ta (58%), while PML v2 uniquely captured CO2 fertilization feedbacks (15%).
- Spatial heterogeneity in ET regulatory mechanisms was evident: Ta showed basin-wide upstream dominance in GLEAM4.2a, but localized sensitivity in MOD16A2H and spatially heterogeneous sensitivities in PML v2.
- SHAP interaction analysis revealed that LAI exhibited the strongest overall interaction strength with other variables, particularly with VPD (0.34), Ta (0.30), and solar radiation (Rg, 0.24), indicating that vegetation density strongly modulates the effects of meteorological drivers.
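The three accuracy metrics reported for the ensemble model can be computed as in the sketch below. It is a plain-Python illustration with made-up numbers. One assumption matters here: the summary does not say how RMSE was normalized, so the example divides by the mean of the observations; normalizing by the range or the standard deviation would give different values. The function names are illustrative.

```python
import math


def nrmse(obs, pred):
    """RMSE divided by the mean observation (normalization is an assumption)."""
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))
    return rmse / (sum(obs) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


# Made-up ET observations and predictions (mm/day), for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]

print(f"nRMSE = {nrmse(obs, pred):.3f}")
print(f"R2    = {r_squared(obs, pred):.3f}")
print(f"r     = {pearson_r(obs, pred):.3f}")
```

As a quick consistency check on the reported values, 0.934² ≈ 0.872, which is in line with R² = 0.87.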
Contributions
- Quantifies scale-dependent evapotranspiration (ET) dynamics and controlling factors in a heterogeneous, climate-homogeneous basin (Heihe River Basin) using an interpretable ensemble machine learning framework.
- Reconciles data-driven predictive power with mechanistic interpretability, providing insights into non-linear interactions and physiological thresholds (e.g., VPD > 1.10 kPa for stomatal regulation).
- Highlights the scale-dependency of CO2 signal detection, showing that products explicitly incorporating CO2 (like PML v2) better capture regional CO2-ET interactions compared to empirical models.
- Reveals strong spatial heterogeneity in ET regulatory mechanisms across the basin, emphasizing the nonstationary nature of land-atmosphere interactions and the need for differentiated modeling frameworks for natural and agricultural biomes.
- Underscores that agricultural water allocation strategies must account for non-linear interactions between vegetation structure, local climate, and elevated CO2 for food security and ecosystem sustainability.
Funding
- National Natural Science Foundation of China (grant no. 42571123)
- Funding agencies of Zhejiang Province and Ningbo Municipality through the program "Novel Technologies for Joint Pollution Reduction and Carbon Sequestration"
- Ningbo Natural Science Foundation (grant no. 2025J026)
- Key Research and Development Program of Ningxia Hui Autonomous Region, China (2021BEG02006–02)
Citation
@article{Lin2026Datadriven,
author = {Lin, Ziqi and Feng, Yu and Wu, Jie Ying and Gong, Daozhi and Xiong, Jinran and Cao, Xinde and Zheng, Chunmiao},
title = {Data-driven attribution of evapotranspiration dynamics in the Heihe River Basin: Controlling factors from site measurements to regional satellite observations},
journal = {Agricultural Water Management},
year = {2026},
doi = {10.1016/j.agwat.2026.110324},
url = {https://doi.org/10.1016/j.agwat.2026.110324}
}
Original Source: https://doi.org/10.1016/j.agwat.2026.110324