Ci et al. (2025) Multi-timescale evapotranspiration fusion: A novel autoencoder with automated machine learning-based approach for enhanced estimation accuracy
Identification
- Journal: Agricultural Water Management
- Year: 2025
- Date: 2025-12-16
- Authors: Mengtao Ci, Xingming Hao, Fan Sun, Qixiang Liang, Xue Fan, Jingjing Zhang, Haibing Xiong, Jinfan Xu, Xinran Guo
- DOI: 10.1016/j.agwat.2025.110086
Research Groups
- Key Laboratory of Ecological Safety and Sustainable Development in Arid Lands, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, China
- University of Chinese Academy of Sciences, Beijing, China
- Akesu National Station of Observation and Research for Oasis Agro-ecosystem, Akesu, Xinjiang, China
Short Summary
This study developed AGFusionET, a novel multi-timescale fusion model combining autoencoders and automated machine learning (AutoML), to integrate 20 heterogeneous evapotranspiration (ET) products. It generated a global, high-resolution (0.05 degrees) ET dataset for 1982–2023, demonstrating superior accuracy (Kling-Gupta Efficiency of 0.88, Root Mean Square Error of 12.12 mm/month) compared to benchmark products.
Objective
- To construct a next-generation global evapotranspiration (ET) dataset featuring high spatial resolution (0.05 degrees), extended temporal coverage (1982–2023), and global robustness.
- To advance ET fusion methodologies from traditional linear weighting towards a new paradigm centered on multimodality, automation, and deep feature-driven approaches by integrating autoencoders and automated machine learning (AutoML).
Study Configuration
- Spatial Scale: Global, with a final resolution of 0.05 degrees (approximately 5.6 km at the equator).
- Temporal Scale: 1982–2023, with monthly aggregation for analysis and validation.
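The scale figures above can be checked with a few lines of arithmetic. The sketch below assumes a spherical Earth with a mean radius of 6371 km and a full, regular latitude-longitude grid; the function names are illustrative and do not come from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius (spherical approximation)

def cell_width_km(resolution_deg, latitude_deg=0.0):
    """East-west width of one grid cell at the given latitude, in km."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360.0
    return resolution_deg * km_per_degree * math.cos(math.radians(latitude_deg))

def grid_shape(resolution_deg):
    """(rows, columns) of a full global regular latitude-longitude grid."""
    return round(180 / resolution_deg), round(360 / resolution_deg)

def monthly_steps(first_year, last_year):
    """Number of monthly time steps in an inclusive year range."""
    return (last_year - first_year + 1) * 12

equator_width = cell_width_km(0.05)        # roughly 5.56 km
rows, cols = grid_shape(0.05)              # 3600 rows by 7200 columns
n_months = monthly_steps(1982, 2023)       # 504 monthly layers
```

Under these assumptions a 0.05-degree cell is about 5.6 km wide at the equator and narrows with the cosine of latitude, so the same cell is roughly half as wide at 60 degrees.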
Methodology and Data
- Models used:
- AGFusionET (Proposed): A multi-timescale fusion model integrating:
- Autoencoder: A symmetric deep autoencoder with a 12-dimensional bottleneck layer for nonlinear dimensionality reduction and latent feature extraction from high-dimensional ET products.
- AutoGluon: An AutoML framework that handles feature selection, ensemble model construction, and hyperparameter optimization. It uses a multi-layer stack ensembling architecture incorporating 12 distinct base model types (e.g., k-nearest neighbors, Random Forest, LightGBM, CatBoost, XGBoost, NeuralNetFastAI, NeuralNetTorch).
- Comparison Models: Random Forest (RF), Gradient Boosting Regressor (GBR), Light Gradient Boosting Machine (LGBM), and a Stacked Ensemble model.
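The shape of a symmetric autoencoder with a 12-dimensional bottleneck can be illustrated with a minimal pure-Python sketch. Only the bottleneck width comes from the summary; the 20-value input (one per ET product), the single hidden layer of width 16, the tanh activation, and the random untrained weights are all assumptions made for illustration, and this is not the authors' implementation.

```python
import math
import random

def symmetric_layer_sizes(n_inputs, hidden, bottleneck=12):
    """Layer widths: encoder, bottleneck, then the mirrored decoder."""
    return [n_inputs, *hidden, bottleneck, *reversed(hidden), n_inputs]

def init_layers(sizes, seed=0):
    """Random (weights, biases) per dense layer; a stand-in for trained parameters."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def dense(x, layer, activate):
    """One fully connected layer, optionally followed by tanh."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [math.tanh(v) for v in out] if activate else out

def encode(layers, x, n_encoder):
    """Map the inputs to the latent (bottleneck) features."""
    for layer in layers[:n_encoder]:
        x = dense(x, layer, activate=True)
    return x

def decode(layers, z, n_encoder):
    """Reconstruct the inputs from the latent features (linear output layer)."""
    decoder = layers[n_encoder:]
    for i, layer in enumerate(decoder):
        z = dense(z, layer, activate=i < len(decoder) - 1)
    return z

N_PRODUCTS = 20   # assumed input layout: one value per ET product
HIDDEN = [16]     # hypothetical hidden width, not given in the summary

sizes = symmetric_layer_sizes(N_PRODUCTS, HIDDEN)   # [20, 16, 12, 16, 20]
layers = init_layers(sizes)
n_encoder = len(HIDDEN) + 1                         # layers up to the bottleneck

rng = random.Random(1)
pixel = [rng.uniform(0.0, 1.0) for _ in range(N_PRODUCTS)]  # made-up scaled ET values
latent = encode(layers, pixel, n_encoder)           # 12 latent features
reconstruction = decode(layers, latent, n_encoder)  # 20 reconstructed values
```

In a trained model the weights would be fitted to minimise reconstruction error, and the 12 latent features would then be passed to the downstream learner in place of the raw, partly redundant product values.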
- Data sources:
- Eddy covariance observations: 585 datasets compiled from FLUXNET, AmeriFlux, EuroFlux, AsiaFlux, ChinaFlux, and two national field observation stations in Xinjiang, China (Aksu Oasis Farmland Ecosystem, Fukang Desert Ecosystem). These served as ground truth for model training and validation.
- Evapotranspiration (ET) products: 20 global ET or latent heat flux (LE) products, categorized as statistical/empirical ensemble, reanalysis-based, remote sensing (RS), process-based modeling, surface energy balance-based, and water balance-based. All products were harmonized to 0.05 degrees spatial resolution and monthly temporal scale.
- Meteorological forcing data: ERA5-Land reanalysis product (0.1 degrees native spatial resolution, monthly temporal scale), including total precipitation (tp), 2 m air temperature (t2m), dew point temperature (d2m), 10 m u- and v-wind components (u10, v10), surface net solar radiation (ssr), surface net thermal radiation (str), soil moisture, and derived Vapour Pressure Deficit (VPD). Resampled to 0.05 degrees.
- Normalised Difference Vegetation Index (NDVI) data: Global Inventory Modelling and Mapping Studies (GIMMS) 3G dataset (1/12 degrees native resolution, 1982–2022) and MOD13C1 observations for 2023. Resampled to 0.05 degrees.
- Digital Elevation Model (DEM) data: GLOBE Digital Elevation Model (1 km spatial resolution).
- MODIS land cover type product (MCD12C1): IGBP classification (0.05 degrees spatial resolution, 2001–2023), used to identify dominant land cover types and snow/ice.
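The derived VPD listed among the meteorological inputs can be computed from 2 m air temperature and dew point temperature. The sketch below assumes the Tetens-type saturation vapour pressure formula from FAO-56 and temperatures in kelvin, as ERA5-Land provides them; the summary does not state which formulation the authors used, so this is one common choice rather than the paper's method.

```python
import math

def saturation_vapour_pressure_kpa(temp_c):
    """Saturation vapour pressure in kPa (FAO-56 Tetens-type formula, degrees Celsius)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))

def vpd_kpa(t2m_k, d2m_k):
    """Vapour pressure deficit in kPa from air and dew point temperature in kelvin."""
    t_c = t2m_k - 273.15
    td_c = d2m_k - 273.15
    # Actual vapour pressure equals saturation vapour pressure at the dew point.
    deficit = saturation_vapour_pressure_kpa(t_c) - saturation_vapour_pressure_kpa(td_c)
    return max(deficit, 0.0)  # clip small negative values from noisy inputs

# Illustrative values only: 25 degrees C air temperature, 15 degrees C dew point.
example_vpd = vpd_kpa(298.15, 288.15)   # about 1.46 kPa
```

When the dew point equals the air temperature the air is saturated and the deficit is zero, which gives a quick sanity check for any implementation.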
Main Results
- Superior Performance: AGFusionET consistently outperformed all 20 benchmark ET products and comparison machine learning models across various validation metrics and scales.
- High Accuracy: Achieved a Kling-Gupta Efficiency (KGE) of 0.88 and a Root Mean Square Error (RMSE) of 12.12 mm/month when validated against monthly ET observations from all available flux tower sites.
- Robust Generalization: On an independent validation set, AGFusionET models showed RMSE values ranging from 16.4 to 16.6 mm/month and KGE scores between 0.835 and 0.855, demonstrating strong robustness and extrapolation capabilities to unseen spatial domains.
- Land Cover Adaptability: AGFusionET achieved the lowest RMSE and smallest uncertainty across all five IGBP land cover types (croplands: 16.63 mm/month; grasslands: 14.35 mm/month; shrublands: 11.08 mm/month; savannas: 13.09 mm/month; forests: 13.59 mm/month), indicating strong adaptability to diverse ecosystems.
- Spatial and Temporal Consistency: The generated global ET dataset (1982–2023, 0.05 degrees) exhibited enhanced spatio-temporal continuity and harmonized inter-product consistency.
- Uncertainty Reduction: AGFusionET showed the lowest uncertainty (10.9 mm/year) among all evaluated products, a significant reduction relative to the individual ET products.
- Trend Representation: When validated against a water-balance-based ET (ETWB) dataset for 56 major river basins, AGFusionET exhibited the best overall trend representation (Nash–Sutcliffe Efficiency of 0.409) compared to other reanalysis and model-driven products.
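The three skill scores reported above (RMSE, KGE, and Nash–Sutcliffe Efficiency) can be sketched in plain Python. The KGE here follows the original 2009 formulation with correlation, variability ratio, and bias ratio terms; the summary does not say which KGE variant the authors used, and the sample values are made up for illustration, not taken from the study.

```python
import math
from statistics import mean, pstdev

def rmse(obs, sim):
    """Root mean square error, in the units of the inputs."""
    return math.sqrt(mean((s - o) ** 2 for o, s in zip(obs, sim)))

def pearson_r(obs, sim):
    """Pearson correlation coefficient between observations and simulations."""
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def kge(obs, sim):
    """Kling-Gupta Efficiency (2009 form): 1 is perfect, lower is worse."""
    r = pearson_r(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)   # variability ratio
    beta = mean(sim) / mean(obs)        # bias ratio
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the observed mean."""
    mo = mean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1 - residual / sum((o - mo) ** 2 for o in obs)

# Made-up monthly ET values in mm/month, purely for illustration.
observed = [20, 45, 80, 110, 70, 30]
simulated = [o + 2 for o in observed]   # a constant +2 mm/month bias

bias_rmse = rmse(observed, simulated)   # equals the bias magnitude, 2.0
bias_kge = kge(observed, simulated)     # penalised only through the bias ratio
bias_nse = nse(observed, simulated)
```

A constant offset leaves correlation and variability untouched, so the KGE loss in this toy case comes entirely from the bias term, while RMSE reports the offset directly.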
Contributions
- Introduces AGFusionET, a novel and generalisable framework for multi-source evapotranspiration (ET) data fusion, leveraging autoencoders for deep feature extraction and AutoML for automated model selection and hyperparameter optimization.
- Generates a high-quality, long-term (1982–2023), and high-resolution (0.05 degrees) global ET dataset with significantly improved spatio-temporal continuity and cross-regional robustness.
- Demonstrates superior accuracy and reliability in ET estimation across diverse ecosystems, particularly in arid and high-latitude regions, outperforming 20 existing benchmark ET products.
- Addresses critical challenges in multi-source ET data integration, such as temporal misalignment, input redundancy, and bias propagation, through a batchwise modeling strategy and deep feature integration.
- Provides a reliable and unified foundational dataset that can support various applications, including hydrological modeling, drought monitoring, water resources management, and climate change research.
Funding
- Strategic Priority Research Program (Category B) of the Chinese Academy of Sciences [grant number XDB0720101]
Citation
@article{Ci2025Multitimescale,
author = {Ci, Mengtao and Hao, Xingming and Sun, Fan and Liang, Qixiang and Fan, Xue and Zhang, Jingjing and Xiong, Haibing and Xu, Jinfan and Guo, Xinran},
title = {Multi-timescale evapotranspiration fusion: A novel autoencoder with automated machine learning-based approach for enhanced estimation accuracy},
journal = {Agricultural Water Management},
year = {2025},
doi = {10.1016/j.agwat.2025.110086},
url = {https://doi.org/10.1016/j.agwat.2025.110086}
}
Original Source: https://doi.org/10.1016/j.agwat.2025.110086