Abdullah et al. (2026) Applications of machine learning in enhancing evaporation estimation for small reservoirs: a case study in semi-arid South Texas
Identification
- Journal: Modeling Earth Systems and Environment
- Year: 2026
- Date: 2026-04-10
- Authors: Syed Muhammad Fahad Abdullah, Chu-Lin Cheng, Jude A. Benavides, Jungseok Ho, R. P. Almeida
- DOI: 10.1007/s40808-026-02773-0
Research Groups
- Department of Civil Engineering, The University of Texas Rio Grande Valley, Edinburg, USA
- School of Earth, Environmental, and Marine Sciences, The University of Texas Rio Grande Valley, Edinburg, USA
- O’Neill School of Public and Environmental Affairs, Indiana University Bloomington, Bloomington, USA
Short Summary
This study developed and validated a multi-reservoir machine learning (ML) framework to enhance daily open-water evaporation estimation for small reservoirs in semi-arid South Texas, demonstrating that Random Forest (RF) and Support Vector Regression (SVR) models significantly outperform traditional empirical methods.
Objective
- To apply, evaluate, and validate four machine learning models (Decision Tree, Random Forest, K-Nearest Neighbor, and Support Vector Regression) within a multi-reservoir training framework to estimate daily open-water evaporation from small reservoirs in the Lower Rio Grande Valley (LRGV) of semi-arid South Texas.
- To benchmark ML model performance against empirical models and a process-based combination method (DLEM), and validate them independently using both DLEM outputs and TexasETNet station observations.
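The multi-reservoir idea in the objectives can be sketched as follows: pool daily samples from several reservoirs, fit the four regressors, and score them on a reservoir that was not part of training. This is only an illustration under stated assumptions. The data are synthetic, `synthetic_reservoir` is a hypothetical helper, the depths and all hyperparameters are invented, and scikit-learn is assumed as the toolkit; the summary does not state which library or settings the authors used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)


def synthetic_reservoir(n_days, area_km2, depth_m):
    """Hypothetical stand-in for daily forcings plus static reservoir attributes."""
    srad = rng.uniform(50, 350, n_days)   # shortwave radiation, W m^-2
    tmax = rng.uniform(15, 40, n_days)    # daily maximum air temperature, degC
    wind = rng.uniform(0.5, 8, n_days)    # 10 m wind speed, m s^-1
    X = np.column_stack([
        srad, tmax, wind,
        np.full(n_days, area_km2),        # static attribute repeated per day
        np.full(n_days, depth_m),
    ])
    # Made-up target in mm d^-1; in the study the target is DLEM evaporation.
    y = 0.015 * srad + 0.08 * tmax + 0.2 * wind + rng.normal(0, 0.5, n_days)
    return X, y


# Pool four training reservoirs (areas span the reported range; depths are invented).
parts = [synthetic_reservoir(300, a, d)
         for a, d in [(1.4, 2.0), (10.0, 3.0), (50.0, 5.0), (354.0, 15.0)]]
X_train = np.vstack([p[0] for p in parts])
y_train = np.concatenate([p[1] for p in parts])

# A fifth reservoir, unseen during training, plays the role of the validation site.
X_val, y_val = synthetic_reservoir(200, 9.7, 2.5)

models = {
    "DT": DecisionTreeRegressor(max_depth=8, random_state=0),
    "RF": RandomForestRegressor(n_estimators=100, random_state=0),
    # Distance- and kernel-based models are scaled first.
    "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=10)),
    "SVR": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_val, y_val)  # coefficient of determination
```

Scoring on a held-out reservoir rather than on held-out days is what tests transferability across sites, which is the property the independent validation at Delta Lake is meant to probe.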
Study Configuration
- Spatial Scale: Regional scale, focusing on the Lower Rio Grande Valley (LRGV) in semi-arid South Texas. Training and testing involved four reservoirs (Valley Acres, Loma Alta, Casa Blanca, Falcon) with surface areas ranging from approximately 1.4 km² to 354 km². Independent validation was conducted at Delta Lake (approximately 9.7 km²).
- Temporal Scale: Daily evaporation estimates were generated using data from 2018 to 2025 for model training and testing. Mean annual evaporation comparisons were based on data from 2018–2022/2023.
Methodology and Data
- Models used:
  - Empirical Models: Penman, Penman-Monteith, Priestley-Taylor, Bowen Ratio Energy Budget (BREB).
  - Benchmark Combination Model: Daily Lake Evaporation Model (DLEM).
  - Machine Learning Models: Decision Tree (DT), Random Forest (RF), K-Nearest Neighbor (KNN), Support Vector Regression (SVR).
- Data sources:
  - Gridded Meteorological Data: gridMET (Abatzoglou 2013) for daily maximum/minimum air temperature (°C), relative humidity (%), wind speed at 10 m (m s⁻¹), incoming shortwave radiation (W m⁻²), vapor pressure deficit (kPa), daily precipitation (mm), and reference evapotranspiration (ET₀; mm d⁻¹). Accessed via Google Earth Engine.
  - Reservoir Attributes: Texas Water Development Board (TWDB) Groundwater Availability Model database for static descriptors: surface area (km²), average depth (m), maximum depth (m), and fetch length (km).
  - Daily Evaporation Estimates (Benchmark): DLEM (Zhao et al. 2024) for training, testing, and validation.
  - Observed Daily Evaporation (Independent Validation): Texas Evapotranspiration Network (TexasETNet) station observations (reference evapotranspiration, ET₀) at Delta Lake.
  - Atmospheric Fields (Cross-check): Real-Time Mesoscale Analysis (RTMA) products.
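For readers unfamiliar with the empirical baselines, here is a minimal sketch of one of them, the Priestley-Taylor equation, E = α · Δ/(Δ + γ) · (Rn − G) / λ. It uses the standard FAO-56 expressions for saturation vapor pressure and its slope. The summary does not say which formulation or coefficients the authors used, so the function names, the coefficient α = 1.26, the sea-level psychrometric constant, and the assumption that net radiation is already available as an input are all illustrative choices.

```python
import math

LATENT_HEAT = 2.45  # latent heat of vaporization, MJ kg^-1 (common constant value)


def wm2_to_mj_per_day(flux_w_m2):
    """Convert a daily-mean flux from W m^-2 to MJ m^-2 d^-1."""
    return flux_w_m2 * 0.0864


def priestley_taylor_mm_per_day(t_mean_c, net_radiation_mj, heat_storage_mj=0.0,
                                alpha=1.26, gamma=0.0674):
    """Priestley-Taylor evaporation in mm d^-1.

    t_mean_c          mean daily air temperature, degC
    net_radiation_mj  net radiation, MJ m^-2 d^-1 (assumed to be given)
    heat_storage_mj   change in heat storage, MJ m^-2 d^-1 (assumed zero by default)
    alpha             Priestley-Taylor coefficient (1.26 is the classic value)
    gamma             psychrometric constant, kPa degC^-1 (sea-level value)
    """
    # Saturation vapor pressure (kPa) and its slope (kPa degC^-1), FAO-56 form.
    es = 0.6108 * math.exp(17.27 * t_mean_c / (t_mean_c + 237.3))
    delta = 4098.0 * es / (t_mean_c + 237.3) ** 2
    available_energy = net_radiation_mj - heat_storage_mj
    # MJ m^-2 d^-1 divided by MJ kg^-1 gives kg m^-2 d^-1, i.e. mm d^-1.
    return alpha * delta / (delta + gamma) * available_energy / LATENT_HEAT


# Example: a warm day with 15 MJ m^-2 d^-1 of net radiation.
evap = priestley_taylor_mm_per_day(t_mean_c=30.0, net_radiation_mj=15.0)
```

Because the equation depends only on temperature and available energy, it ignores wind and humidity, which is one reason such baselines can struggle in advective, semi-arid settings.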
Main Results
- Empirical Model Performance: Priestley-Taylor was the best-performing empirical model at Valley Acres Reservoir (R² = 0.33, RMSE = 2.45 mm d⁻¹, NSE = 0.19), while Penman was the weakest (R² = 0.22, RMSE = 3.60 mm d⁻¹, NSE = -0.73).
- ML Model Performance (Multi-reservoir testing, DLEM reference): Random Forest (RF) and Support Vector Regression (SVR) consistently outperformed DT and KNN. RF achieved the best accuracy (R² = 0.67, RMSE = 1.53 mm d⁻¹, NSE = 0.67), and SVR showed lower bias (Relative Bias = 0.37%).
- Independent Validation at Delta Lake:
  - Against DLEM: RF showed the highest skill (R² = 0.78, RMSE = 1.22 mm d⁻¹), with SVR performing similarly (R² = 0.75, RMSE = 1.32 mm d⁻¹).
  - Against TexasETNet: SVR marginally outperformed RF (R² = 0.33 vs. 0.32), suggesting slightly better transferability to independent station data.
- Mean Annual Evaporation: ML predictions (1620–1698 mm yr⁻¹) closely aligned with DLEM estimates (1620 mm yr⁻¹) at Delta Lake, exceeding TexasETNet observations (1452 mm yr⁻¹). ML and DLEM showed narrower interannual variability (± 300–350 mm yr⁻¹) compared to TexasETNet (± 700 mm yr⁻¹).
- Feature Importance: Shortwave radiation, maximum and minimum air temperature, and reference evapotranspiration (ET₀) were identified as the dominant drivers of evaporation. Reservoir attributes (surface area, average depth, maximum depth, fetch) played a smaller role.
- ML Improvement over Empirical Baselines: ML models, particularly RF and SVR, delivered substantial improvements in accuracy, bias reduction, and generalization compared to empirical models, achieving NSE values above 0.65.
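The skill scores quoted above can be computed as in the sketch below. The definitions are assumptions, since the summary does not give formulas: R² is taken as the squared Pearson correlation (consistent with R² and NSE differing for the empirical models), NSE is the Nash-Sutcliffe efficiency, and relative bias is the summed error as a percentage of the summed reference values. The function name `evaluation_metrics` is hypothetical.

```python
import math


def evaluation_metrics(obs, sim):
    """Return R^2, RMSE, NSE, and relative bias (%) for paired series."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))          # squared errors
    sst = sum((o - mean_obs) ** 2 for o in obs)                # spread of reference
    var_sim = sum((s - mean_sim) ** 2 for s in sim)            # spread of predictions
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    return {
        "r2": cov ** 2 / (sst * var_sim),   # squared Pearson correlation
        "rmse": math.sqrt(sse / n),         # same units as the inputs
        "nse": 1.0 - sse / sst,             # 1 is perfect, below 0 is worse than the mean
        "rbias_pct": 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs),
    }


# A prediction with a constant +0.5 offset: perfectly correlated but biased.
metrics = evaluation_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

In this toy case R² is 1 while NSE drops to 0.8, which illustrates how a model such as Penman can show a positive R² alongside a negative NSE when its estimates are systematically offset.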
Contributions
- Development and validation of a transferable multi-reservoir machine learning framework for accurate daily open-water evaporation estimation in data-scarce, semi-arid regions.
- Implementation of a robust dual validation strategy, comparing ML model predictions against both a process-based model (DLEM) and independent ground-based observations (TexasETNet), clarifying the influence of dataset characteristics on model evaluation.
- Demonstration that ensemble and kernel-based ML models (Random Forest and Support Vector Regression) significantly outperform traditional empirical methods in terms of predictive accuracy, bias reduction, and generalization for small reservoirs.
- Identification of key meteorological drivers of evaporation in semi-arid environments through feature importance and SHAP analyses, enhancing the physical interpretability of ML models.
- Provision of a scalable and data-efficient approach to improve water management and reservoir operations in under-monitored semi-arid basins, supporting climate adaptation strategies.
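One way to rank drivers, in the spirit of the feature-importance analysis mentioned above, is permutation importance. Note that this is a swapped-in technique for illustration: the study reports feature importance and SHAP analyses, and its exact procedure is not described in this summary. The data are synthetic, the feature names are hypothetical, and scikit-learn is an assumed toolkit.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(42)
n = 600
feature_names = ["srad", "tmax", "wind", "surface_area"]

# Synthetic predictors: three meteorological drivers and one static attribute.
X = np.column_stack([
    rng.uniform(50, 350, n),                  # shortwave radiation, W m^-2
    rng.uniform(15, 40, n),                   # max air temperature, degC
    rng.uniform(0.5, 8, n),                   # wind speed, m s^-1
    rng.choice([1.4, 10.0, 50.0, 354.0], n),  # surface area, km^2
])
# Made-up target that depends mostly on radiation and not at all on area.
y = 0.015 * X[:, 0] + 0.08 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 0.3, n)

X_fit, y_fit = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_fit, y_fit)

# Shuffle one column at a time on held-out data and record the drop in score.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranked = sorted(zip(feature_names, result.importances_mean),
                key=lambda kv: kv[1], reverse=True)
```

Computing the importances on held-out data rather than training data avoids rewarding features the model has merely memorized.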
Funding
- USDA Foundational and Applied Science Program (USDA#2023-67020-39704)
- Department of Civil Engineering, The University of Texas Rio Grande Valley
- School of Earth, Environmental, and Marine Sciences, The University of Texas Rio Grande Valley
Citation
@article{Abdullah2026Applications,
author = {Abdullah, Syed Muhammad Fahad and Cheng, Chu-Lin and Benavides, Jude A. and Ho, Jungseok and Almeida, R. P.},
title = {Applications of machine learning in enhancing evaporation estimation for small reservoirs: a case study in semi-arid South Texas},
journal = {Modeling Earth Systems and Environment},
year = {2026},
doi = {10.1007/s40808-026-02773-0},
url = {https://doi.org/10.1007/s40808-026-02773-0}
}
Original Source: https://doi.org/10.1007/s40808-026-02773-0