Qiao et al. (2025) Improving the accuracy of gridded snow depth estimation through multi-source data and a machine learning fusion model
Identification
- Journal: Scientific Reports
- Year: 2025
- Date: 2025-11-20
- Authors: Dejing Qiao, Xiaoxiao Chen, Jianmin Zhou, Shuang Liang, Guixiang Liu
- DOI: 10.1038/s41598-025-22347-x
Research Groups
- College of Surveying and Geo-Informatics, North China University of Water Resources and Electric Power, Zhengzhou, China
- Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
- College of Civil Engineering and Communication, North China University of Water Resources and Electric Power, Zhengzhou, China
- State Key Laboratory of Frozen Soil Engineering, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou, China
Short Summary
This study developed a Random Forest (RF) machine learning fusion method to improve the accuracy of gridded snow depth (SD) estimates over China from 2014 to 2018 by integrating multi-source SD data (ground-based, satellite-derived, reanalysis) with environmental ancillary information. The fused product achieved a higher Kling-Gupta efficiency (KGE) and a lower Root Mean Squared Error (RMSE) than the individual input products.
Objective
- To test whether incorporating multi-source snow depth (SD) products and environmental information into a machine learning model improves SD estimation.
Study Configuration
- Spatial Scale: China, gridded at 0.25° × 0.25° resolution.
- Temporal Scale: Daily, from 2014 to 2018.
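Pairing station observations with a 0.25° grid requires mapping each station's coordinates to a grid cell. The following is a minimal sketch of one way to do that; the grid origin at (-90°, -180°), the edge-aligned cells, and the function names `grid_cell` and `cell_center` are assumptions for illustration, not details taken from the paper.

```python
import math

CELL_DEG = 0.25  # grid spacing stated in the study configuration


def grid_cell(lat, lon, cell=CELL_DEG):
    """Return the (row, col) of the cell containing a point.

    Assumes a global grid whose cell edges start at lat -90, lon -180
    (an illustrative convention, not taken from the paper).
    """
    row = math.floor((lat + 90.0) / cell)
    col = math.floor((lon + 180.0) / cell)
    return row, col


def cell_center(row, col, cell=CELL_DEG):
    """Return the (lat, lon) of the center of a cell under the same convention."""
    return (-90.0 + (row + 0.5) * cell, -180.0 + (col + 0.5) * cell)


# Example: a hypothetical station location in central China.
row, col = grid_cell(34.75, 113.62)
center_lat, center_lon = cell_center(row, col)
```

Stations that fall in the same cell would share one set of gridded predictor values, so how multiple stations per cell are handled (averaging, or keeping each separately) is a further choice the summary does not specify.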
Methodology and Data
- Models used: Random Forest (RF) algorithm for fusion. The input SD products are themselves outputs of the following models and methods:
- HTESSEL (ERA-Interim)
- Catchment model (MERRA2)
- Noah model (GLDAS-NOAH)
- Optimal interpolation method and a simple snow melting/accumulation model (CMC)
- Empirical algorithms based on passive microwave brightness temperature (WESTDC)
- Data sources:
- Gridded Snow Depth (SD) Products:
- WESTDC SD dataset (satellite-derived, passive microwave)
- ERA-Interim SD dataset (reanalysis)
- MERRA2 SD dataset (reanalysis)
- GLDAS-NOAH SD dataset (land surface model simulation)
- CMC SD dataset (ground-based spatial interpolation, original resolution 24000 m)
- In-situ Observations: Daily in-situ SD data from China National Meteorological Information Center (CMA) at 945 stations across China (2014-2018).
- Ancillary Information:
- Land cover types (MODIS MCD12Q1, IGBP classification, original resolution 500 m)
- Forest cover fraction (calculated from IGBP land cover)
- Geographical information (elevation from GLOBE DEM, original resolution 1000 m; latitude, longitude)
- Land cover heterogeneity (Gini–Simpson index, GSI, calculated from IGBP land cover)
- Surface roughness (standard deviation of elevations from GLOBE DEM)
- Snow class (NSIDC snow cover classification system)
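Three of the ancillary predictors listed above are derived quantities: forest cover fraction, the Gini–Simpson index (GSI), and surface roughness. The sketch below shows how each could be computed for a single 0.25° cell from the fine-resolution pixels it contains. It assumes that "forest" means IGBP classes 1 to 5, that GSI takes the standard form 1 − Σ pᵢ², and that roughness is the population standard deviation; the paper may define these differently, and all function names are illustrative.

```python
from collections import Counter
from statistics import pstdev

# IGBP classes 1-5 are the five forest types in the MCD12Q1 IGBP scheme.
# Treating exactly these as "forest" is an assumption of this sketch.
IGBP_FOREST_CLASSES = {1, 2, 3, 4, 5}


def forest_cover_fraction(classes):
    """Fraction of land cover pixels in the cell that are forest classes."""
    forest = sum(1 for c in classes if c in IGBP_FOREST_CLASSES)
    return forest / len(classes)


def gini_simpson_index(classes):
    """Land cover heterogeneity: 1 - sum(p_i^2) over class proportions p_i.

    0 means a single class fills the cell; values near 1 mean many
    classes in similar proportions.
    """
    n = len(classes)
    counts = Counter(classes)
    return 1.0 - sum((k / n) ** 2 for k in counts.values())


def surface_roughness(elevations):
    """Standard deviation of DEM elevations (m) within the cell.

    Uses the population standard deviation; the paper does not say
    which estimator was used.
    """
    return pstdev(elevations)


# Example cell: half evergreen needleleaf forest (1), half grassland (10).
pixels = [1, 1, 10, 10]
dem = [100.0, 100.0, 200.0, 200.0]
features = {
    "forest_fraction": forest_cover_fraction(pixels),
    "gsi": gini_simpson_index(pixels),
    "roughness_m": surface_roughness(dem),
}
```

Per-cell values like these would sit alongside the five gridded SD products, elevation, latitude, longitude, land cover type, and snow class as inputs to the RF model.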
Main Results
- The developed RF-SD fusion model significantly improved the accuracy of snow depth (SD) estimates over China.
- RF-SD data achieved a Kling-Gupta efficiency (KGE) of 0.73, compared with values ranging from 0.21 to 0.64 for the five original SD datasets.
- RF-SD data had an RMSE of 0.051 m, lower than that of the original datasets (e.g., 0.083 m for WESTDC and 0.082 m for MERRA2).
- The RF-SD model showed better performance across different land cover types, forest cover fractions, land cover heterogeneity (GSI), surface roughness, and snow classifications compared to individual products.
- The performance of individual gridded SD products varied significantly with environmental conditions (e.g., passive microwave products had greater error in forest cover regions, CMC had large errors in grassland and ephemeral snow areas, reanalysis models had high error in forest areas).
- The inclusion of multiple input predictor variables (gridded SD data and environmental-related data) contributed to better SD estimations.
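The two headline metrics can be computed from paired estimated and observed snow depths. The sketch below assumes the original KGE formulation of Gupta et al. (2009), KGE = 1 − √((r − 1)² + (α − 1)² + (β − 1)²), where r is the Pearson correlation, α the ratio of standard deviations, and β the ratio of means. The summary does not state which KGE variant the authors used, and the sample values are invented for illustration.

```python
import math
from statistics import mean, pstdev


def rmse(sim, obs):
    """Root mean squared error, in the same units as the inputs (here m)."""
    return math.sqrt(mean([(s - o) ** 2 for s, o in zip(sim, obs)]))


def kge(sim, obs):
    """Kling-Gupta efficiency (2009 form, assumed); 1.0 is a perfect match.

    Undefined when the observations are constant or have zero mean,
    because alpha, r, or beta would divide by zero.
    """
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = mean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up snow depths in metres, for illustration only.
observed = [0.00, 0.10, 0.20, 0.40]
estimated = [0.02, 0.08, 0.25, 0.35]
scores = {"rmse_m": rmse(estimated, observed), "kge": kge(estimated, observed)}
```

Unlike RMSE, KGE penalizes correlation, variability, and bias errors separately, so a product can have a moderate RMSE and still score poorly on KGE if it systematically under- or overestimates snow depth.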
Contributions
- Developed a novel Random Forest (RF)-based fusion method that integrates multi-source snow depth (SD) products (ground-based, satellite-derived, reanalysis, and model simulations) with comprehensive environmental information (land cover, forest cover, geographical data, heterogeneity, roughness, snow class).
- Generated a high-accuracy gridded SD spatial distribution product for China from 2014 to 2018, demonstrating significant improvement over individual existing SD datasets.
- Provided a detailed evaluation of the performance of both original and fused SD products under diverse environmental and perturbing factors, highlighting the strengths and weaknesses of different SD estimation methods.
- Demonstrated the effectiveness of machine learning in dealing with complex nonlinear relationships for geophysical parameter product fusion, offering a robust approach applicable to other regions with complex topography and climate.
Funding
- National Natural Science Foundation of Henan Province (No. 232300420441)
- Postdoctoral Funding Project of Gansu Province (No. E33980232)
- Second Tibetan Plateau Scientific Expedition and Research (STEP) program (Grant No. 2019QZKK0201)
- National Natural Science Foundation of China (No. 42071084, 41974108)
- Henan Provincial Science and Technology Research Project (No. 222102320021)
Citation
@article{Qiao2025Improving,
author = {Qiao, Dejing and Chen, Xiaoxiao and Zhou, Jianmin and Liang, Shuang and Liu, Guixiang},
title = {Improving the accuracy of gridded snow depth estimation through multi-source data and a machine learning fusion model},
journal = {Scientific Reports},
year = {2025},
doi = {10.1038/s41598-025-22347-x},
url = {https://doi.org/10.1038/s41598-025-22347-x}
}
Original Source: https://doi.org/10.1038/s41598-025-22347-x