He et al. (2025) Hybrid Lake Model (HyLake) v1.0: unifying deep learning and physical principles for simulating lake-atmosphere interactions
Identification
- Journal: Geoscientific Model Development
- Year: 2025
- Date: 2025-12-01
- Authors: Yuan He, Xiaofan Yang
- DOI: 10.5194/gmd-18-9257-2025
Research Groups
- State Key Laboratory of Earth Surface Processes and Disaster Risk Reduction, Faculty of Geographical Science, Beijing Normal University, Beijing, China
- Guangdong Provincial Observation and Research Station for Coupled Human and Natural Systems in Land-ocean Interaction Zone, Beijing Normal University at Zhuhai, Zhuhai, China
Short Summary
This study introduces HyLake v1.0, a hybrid lake model that couples physics-based surface energy balance equations with a Bayesian Optimized Bidirectional Long Short-Term Memory-based (BO-BLSTM-based) surrogate to simulate lake surface temperature (LST) dynamics. Compared with a traditional process-based model and other hybrid variants, it simulates lake-atmosphere interactions more accurately, and it generalizes well to ungauged sites and to forcing datasets not seen during training.
Objective
- To develop a novel hybrid lake model, HyLake v1.0, by embedding an LSTM-based surrogate into a process-based lake model.
- To validate the performance of HyLake v1.0 in simulating LST, latent heat (LE), and sensible heat (HE) against observations from the Taihu Lake Eddy Flux Network.
- To evaluate the transferability of HyLake v1.0 to ungauged sites with varying biological characteristics using ECMWF Reanalysis v5 (ERA5) forcing datasets.
Study Configuration
- Spatial Scale: Lake Taihu (area: 2400 km², average depth: 1.9 m), observed at five sites: Meiliangwan (MLW), Dapukou (DPK), Bifenggang (BFG), Xiaoleishan (XLS), and Pingtaishan (PTS). Lake Chaohu (area: 760 km², average depth: 3.06 m) is used for the transferability assessment. ERA5 datasets have a spatial resolution of 0.25°.
- Temporal Scale: Data from 2012 to 2015. Observations collected at 30 min intervals. ERA5 datasets are hourly. Analysis of daily and hourly trends.
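Because the observations are half-hourly and ERA5 is hourly, any comparison between the two needs a common time step. The summary does not say how the authors aligned them, so the following is only a minimal sketch of one common choice: averaging 30 min samples into hourly bins. The function name `to_hourly_means` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def to_hourly_means(records):
    """Average (timestamp, value) pairs into bins keyed by the start of each hour.

    Missing samples (value is None) are skipped rather than treated as zero.
    """
    bins = defaultdict(list)
    for timestamp, value in records:
        if value is None:
            continue
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        bins[hour_start].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(bins.items())}


# Illustrative (made-up) half-hourly LST samples in degrees Celsius.
observations = [
    (datetime(2013, 7, 1, 0, 0), 26.0),
    (datetime(2013, 7, 1, 0, 30), 27.0),
    (datetime(2013, 7, 1, 1, 0), 28.0),
    (datetime(2013, 7, 1, 1, 30), None),  # gap in the record
]
hourly = to_hourly_means(observations)
```

The same binning with `hour_start` replaced by the calendar date would give the daily series used for daily trend analysis.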
Methodology and Data
- Models used:
  - Hybrid Lake Model v1.0 (HyLake v1.0): Proposed model, hard-coupling a process-based backbone (PBBM) with a Bayesian Optimized Bidirectional Long Short-Term Memory-based (BO-BLSTM-based) surrogate for LST approximation.
  - Process-Based Backbone Model (PBBM): Simplified process-based lake model based on energy balance equations and 1-D vertical lake water temperature transport equations.
  - Freshwater Lake (FLake) model: Traditional process-based bulk model for intercomparison.
  - Baseline: Hybrid model with an LSTM-based surrogate trained on PBBM outputs.
  - TaihuScene: Hybrid model with a BO-BLSTM-based surrogate trained on observations from all five Lake Taihu sites.
- Data sources:
  - Taihu Lake Eddy Flux Network: Hydrometeorological variables (air humidity, air temperature, wind speed, net radiation components), latent heat (LE), sensible heat (HE), and inferred radiative lake surface temperature (LST) from five sites in Lake Taihu.
  - ECMWF Reanalysis v5 (ERA5) datasets: Hourly meteorological variables (air temperature, dew point temperature, surface pressure, wind speed, surface net longwave and shortwave radiation) with 0.25° spatial resolution.
  - MODIS/Terra Land Surface Temperature/Emissivity Daily L3 Global 1km SIN Grid V061 imagery (MYD11A1) for Lake Chaohu LST validation.
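To illustrate how an LST estimate and the listed forcing variables can yield LE and HE, the sketch below uses the textbook bulk aerodynamic formulas, with humidity derived from dew point and surface pressure via the Magnus approximation. This summary does not give the PBBM's actual parameterization, so this is an assumption-laden illustration rather than the authors' scheme: the constant air density, the transfer coefficients `C_H` and `C_E`, and all function names are hypothetical. Inputs are taken in °C and hPa, whereas ERA5 delivers K and Pa, so a unit conversion would be needed first.

```python
import math

RHO_AIR = 1.2       # air density, kg m^-3 (assumed constant)
CP_AIR = 1005.0     # specific heat of air at constant pressure, J kg^-1 K^-1
LV = 2.45e6         # latent heat of vaporization, J kg^-1
C_H = 1.3e-3        # bulk transfer coefficient for heat (placeholder value)
C_E = 1.3e-3        # bulk transfer coefficient for moisture (placeholder value)


def saturation_vapor_pressure(temp_c):
    """Saturation vapor pressure over water in hPa (Magnus approximation)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def specific_humidity(vapor_pressure_hpa, pressure_hpa):
    """Specific humidity in kg/kg from vapor pressure and total pressure."""
    return 0.622 * vapor_pressure_hpa / (pressure_hpa - 0.378 * vapor_pressure_hpa)


def bulk_fluxes(lst_c, air_c, dewpoint_c, pressure_hpa, wind_ms):
    """Return (sensible, latent) heat flux in W m^-2, positive upward."""
    # Air just above the water is assumed saturated at the surface temperature.
    q_surface = specific_humidity(saturation_vapor_pressure(lst_c), pressure_hpa)
    # Actual vapor pressure equals saturation vapor pressure at the dew point.
    q_air = specific_humidity(saturation_vapor_pressure(dewpoint_c), pressure_hpa)
    sensible = RHO_AIR * CP_AIR * C_H * wind_ms * (lst_c - air_c)
    latent = RHO_AIR * LV * C_E * wind_ms * (q_surface - q_air)
    return sensible, latent


# Made-up forcing: lake 3 degrees warmer than the air, moderate wind.
sensible_flux, latent_flux = bulk_fluxes(
    lst_c=28.0, air_c=25.0, dewpoint_c=20.0, pressure_hpa=1010.0, wind_ms=3.0
)
```

In this form both fluxes scale with wind speed and with the surface-air gradient, which is why an error in LST carries over directly into LE and HE.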
Main Results
- HyLake v1.0 significantly outperformed FLake and Baseline models at the MLW site, achieving an R of 0.99 and RMSE of 1.08 °C for LST, R of 0.94 and RMSE of 24.65 W m⁻² for LE, and R of 0.93 and RMSE of 7.15 W m⁻² for HE.
- In the Taihu-obs experiment (all Lake Taihu sites, observation-forced), HyLake v1.0 demonstrated superior overall performance with MAE values of 1.03 °C for LST, 24.79 W m⁻² for LE, and 7.88 W m⁻² for HE, outperforming both FLake and TaihuScene.
- In the Taihu-ERA5 experiment (all Lake Taihu sites, ERA5-forced), HyLake v1.0 maintained its superior performance with MAE values of 0.90 °C for LST, 35.02 W m⁻² for LE, and 7.97 W m⁻² for HE, outperforming FLake and TaihuScene in most variables across sites.
- HyLake v1.0 exhibited excellent transferability to the ungauged Lake Chaohu, showing an LST RMSE of 2.07 °C and MAE of 1.57 °C when forced with ERA5 data.
- HyLake v1.0, trained on a smaller dataset (MLW site only), often outperformed TaihuScene, which was trained on observations from all five Taihu sites. This challenges the assumption that larger training datasets always improve deep-learning model performance.
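The results above are reported as Pearson correlation (R), root-mean-square error (RMSE), and mean absolute error (MAE). As a reference for how these three scores are conventionally computed from paired observed and simulated series, here is a minimal sketch using the standard definitions; the function names and sample values are illustrative and not taken from the paper.

```python
import math


def mae(observed, simulated):
    """Mean absolute error."""
    return sum(abs(s - o) for o, s in zip(observed, simulated)) / len(observed)


def rmse(observed, simulated):
    """Root-mean-square error."""
    return math.sqrt(
        sum((s - o) ** 2 for o, s in zip(observed, simulated)) / len(observed)
    )


def pearson_r(observed, simulated):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    covariance = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_s = math.sqrt(sum((s - mean_s) ** 2 for s in simulated))
    return covariance / (spread_o * spread_s)


# Made-up LST values (degrees Celsius): the simulation has a constant warm bias.
obs_lst = [20.0, 22.0, 24.0, 26.0]
sim_lst = [20.5, 22.5, 24.5, 26.5]
scores = (pearson_r(obs_lst, sim_lst), rmse(obs_lst, sim_lst), mae(obs_lst, sim_lst))
```

A constant bias leaves R at 1 while RMSE and MAE both equal the bias, which is why the three metrics are reported together. RMSE is never smaller than MAE for the same series, consistent with the Lake Chaohu values (2.07 °C versus 1.57 °C).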
Contributions
- Development of HyLake v1.0, a novel hybrid lake model that effectively integrates physics-based principles with deep learning (BO-BLSTM-based surrogate) for accurate and numerically stable simulation of lake surface temperature and heat fluxes.
- Demonstrated improved accuracy in modeling lake-atmosphere interactions (LST, LE, HE) compared to established process-based models (FLake) and other hybrid approaches.
- Validated the strong generalization and transferability of HyLake v1.0 to ungauged lake sites and with different forcing datasets (ERA5), highlighting its potential for broader application, especially in data-sparse regions.
- Provided insights into the effectiveness of training strategies for hybrid models, suggesting that carefully curated, representative smaller datasets can sometimes yield better performance than larger, more heterogeneous ones.
Funding
- Guangdong Provincial Observation and Research Station (grant no. 2024B1212040003)
- National Key R&D Program of China (grant no. 2023YFC3208905)
Citation
@article{He2025Hybrid,
author = {He, Yuan and Yang, Xiaofan},
title = {Hybrid Lake Model (HyLake) v1.0: unifying deep learning and physical principles for simulating lake-atmosphere interactions},
journal = {Geoscientific Model Development},
year = {2025},
doi = {10.5194/gmd-18-9257-2025},
url = {https://doi.org/10.5194/gmd-18-9257-2025}
}
Original Source: https://doi.org/10.5194/gmd-18-9257-2025