Barzegar et al. (2026) Explaining Great Lakes water level variability through interpretable ensemble machine learning
Identification
- Journal: The Science of The Total Environment
- Year: 2026
- Date: 2026-01-01
- Authors: Rahim Barzegar, Ehsan Raei, Jan Adamowski
- DOI: 10.1016/j.scitotenv.2025.181302
Research Groups
- Groundwater Research Group (GRES), Research Institute on Mines and Environment (RIME), Université du Québec en Abitibi-Témiscamingue (UQAT), Amos, Québec, Canada
- Department of Bioresource Engineering, McGill University, Sainte-Anne-de-Bellevue, Quebec, Canada
- United Nations University Institute for Water, Environment and Health (UNU-INWEH), Richmond Hill, ON, Canada
Short Summary
This study develops an interpretable ensemble machine learning framework to quantify the immediate and lagged controls of environmental drivers on monthly water-level fluctuations in the Great Lakes (Superior, Michigan, Erie, and Ontario). Boosting-based models and a committee ensemble improve predictions relative to Random Forest and AdaBoost. Inflow and outflow are the dominant drivers, while temperature, evaporation, and runoff act as secondary, lake-specific modulators.
Objective
- To quantify the immediate and lagged controls of environmental drivers on monthly water-level fluctuations in Lakes Superior, Michigan, Erie, and Ontario using an interpretable multi-model machine learning framework.
Study Configuration
- Spatial Scale: Lakes Superior, Michigan, Erie, and Ontario (Great Lakes region).
- Temporal Scale: Monthly fluctuations over the period 1982–2022, with lagged predictors up to six months.
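A minimal sketch of how lagged predictors of this kind can be built from a monthly series. The function name `make_lagged_features`, the `lag_k` key naming, and the sample values are illustrative assumptions, not taken from the paper; lag 0 is assumed to stand for the immediate (same-month) effect.

```python
from typing import Dict, List


def make_lagged_features(series: List[float], max_lag: int = 6) -> List[Dict[str, float]]:
    """Return one row per month that has a complete lag history.

    The row for month t stores series[t - k] under the key 'lag_k'
    for k = 0..max_lag. The first max_lag months are dropped because
    their history is incomplete.
    """
    if max_lag < 0:
        raise ValueError("max_lag must be non-negative")
    rows = []
    for t in range(max_lag, len(series)):
        rows.append({f"lag_{k}": series[t - k] for k in range(max_lag + 1)})
    return rows


# Hypothetical monthly inflow values (illustrative numbers, not data from the paper).
inflow = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
rows = make_lagged_features(inflow, max_lag=6)
```

With eight months of data and a six-month maximum lag, only the last two months have a full history, so `rows` holds two records. The same function would be applied to each driver separately before the columns are joined into one predictor table.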
Methodology and Data
- Models used: Eight tree-based algorithms (Random Forest, Extra Trees, Gradient Boosting Regression Trees (GBRT), Histogram-Based Gradient Boosting (HGBRT), XGBoost, LightGBM, CatBoost, and AdaBoost) integrated into a Supervised Committee Machine Learning (SCML) ensemble. SHapley Additive exPlanations (SHAP) and Variogram Analysis of Response Surfaces (VARS) were used for interpretability.
- Data sources: Environmental drivers used as predictors include inflow, outflow, evaporation, runoff, and air temperature.
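The summary does not state how the SCML ensemble combines its members. As a hedged illustration only, the sketch below assumes a linear combiner whose weights are fit by least squares on the members' predictions; the function names and the toy numbers are hypothetical, and the paper's combiner may well be a different (for example nonlinear) supervised learner.

```python
from typing import List


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: member predictions are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_committee_weights(member_preds: List[List[float]], target: List[float]) -> List[float]:
    """Weights w minimising sum_t (target[t] - sum_j w[j] * member_preds[j][t])**2."""
    k = len(member_preds)
    gram = [[sum(p * q for p, q in zip(member_preds[i], member_preds[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * y for p, y in zip(member_preds[i], target)) for i in range(k)]
    return solve_linear_system(gram, rhs)  # normal equations


def committee_predict(member_preds: List[List[float]], weights: List[float]) -> List[float]:
    """Weighted sum of the member predictions at each time step."""
    n = len(member_preds[0])
    return [sum(w * preds[t] for w, preds in zip(weights, member_preds)) for t in range(n)]


# Toy example: two hypothetical members with opposite errors around the target.
target = [1.0, 2.0, 3.0, 4.0]
member_a = [1.2, 1.8, 3.2, 3.8]
member_b = [0.8, 2.2, 2.8, 4.2]
weights = fit_committee_weights([member_a, member_b], target)
combined = committee_predict([member_a, member_b], weights)
```

In this toy case the two members err in opposite directions, so the fitted weights are close to 0.5 each and the combined prediction recovers the target. In practice the weights would be fit on held-out predictions rather than on the data used to train the members, to avoid rewarding overfitted members.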
Main Results
- Boosting-based machine learning models (XGBoost, LightGBM, HGBRT) significantly improved Great Lakes water-level prediction compared to Random Forest and AdaBoost.
- The Supervised Committee Machine Learning (SCML) ensemble provided the most stable overall predictions, achieving Root Mean Square Error (RMSE) values as low as 0.118 m.
- The SHAP–VARS framework effectively identified dominant drivers and revealed their lagged sensitivities.
- Inflow and outflow are the overwhelmingly dominant drivers of lake-level dynamics across the Great Lakes.
- Evaporation, runoff, and air temperature act as secondary but lake-specific modulators of water levels.
- Runoff and inflow primarily govern water levels in Lakes Superior and Michigan, while inflow overwhelmingly drives Lake Erie's dynamics.
- Temperature and evaporation exert strong long-lag effects, particularly evident in Lake Ontario.
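For reference, the RMSE quoted above is the square root of the mean squared difference between observed and predicted levels, so it carries the same unit as the water level (metres). A short sketch follows; the sample levels are hypothetical and are not values from the paper.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equal-length, non-empty sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical monthly levels in metres (illustrative only).
observed_levels = [183.0, 183.2, 183.4]
predicted_levels = [183.1, 183.1, 183.5]
error_m = rmse(observed_levels, predicted_levels)
```

Every prediction here is off by 0.1 m, so `error_m` is about 0.1 m. Because errors are squared before averaging, a few large misses raise RMSE more than many small ones.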
Contributions
- Development of an interpretable, multi-model machine learning framework for enhanced Great Lakes water-level prediction.
- Quantification of immediate and lagged controls of environmental drivers on lake levels using a novel SHAP–VARS framework.
- Identification of dominant and secondary environmental drivers and their lake-specific lagged sensitivities across the Great Lakes.
- Significant improvement in water-level prediction stability and accuracy through the application of a Supervised Committee Machine Learning (SCML) ensemble.
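The SHAP side of the interpretability framework rests on Shapley values, which split a prediction's departure from a baseline among the input features. The sketch below computes them by brute force over all feature orderings for a toy function. It is not the SHAP library's tree algorithm, it uses a single all-zero baseline as a simplifying assumption, and the toy model and its coefficients are hypothetical. VARS is not sketched here.

```python
from itertools import permutations
from typing import Callable, Dict


def exact_shapley(
    model: Callable[[Dict[str, float]], float],
    x: Dict[str, float],
    baseline: Dict[str, float],
) -> Dict[str, float]:
    """Average each feature's marginal contribution over all orderings.

    Features start at their baseline values and are switched to the
    values in x one at a time, in the order given by each permutation.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous_output = model(current)
        for name in order:
            current[name] = x[name]
            new_output = model(current)
            totals[name] += new_output - previous_output
            previous_output = new_output
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(v: Dict[str, float]) -> float:
    """Hypothetical response with an inflow-evaporation interaction."""
    return 2.0 * v["inflow"] - v["outflow"] + v["inflow"] * v["evaporation"]


x = {"inflow": 1.0, "outflow": 1.0, "evaporation": 2.0}
baseline = {"inflow": 0.0, "outflow": 0.0, "evaporation": 0.0}
phi = exact_shapley(toy_model, x, baseline)
```

The attributions sum to the difference between the model output at `x` and at the baseline, and the interaction term is shared equally between inflow and evaporation. Brute-force enumeration grows factorially with the number of features, which is why tree-specific algorithms are used for real models with many lagged predictors.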
Funding
- Not reported in the available text.
Citation
@article{Barzegar2026Explaining,
author = {Barzegar, Rahim and Raei, Ehsan and Adamowski, Jan},
title = {Explaining Great Lakes water level variability through interpretable ensemble machine learning},
journal = {The Science of The Total Environment},
year = {2026},
doi = {10.1016/j.scitotenv.2025.181302},
url = {https://doi.org/10.1016/j.scitotenv.2025.181302}
}
Original Source: https://doi.org/10.1016/j.scitotenv.2025.181302