Raoult et al. (2025) Parameter Estimation in Land Surface Models: Challenges and Opportunities With Data Assimilation and Machine Learning
Identification
- Journal: Journal of Advances in Modeling Earth Systems
- Year: 2025
- Date: 2025-10-28
- Authors: Nina Raoult, Natalie Douglas, Natasha MacBean, Jana Kolassa, Tristan Quaife, Andrew Roberts, Rosie A. Fisher, Istem Fer, Cédric Bacour, Katherine Dagon, Linnia Hawkins, Nuno Carvalhais, Elizabeth Cooper, Michael C. Dietze, Pierre Gentine, Thomas Kaminski, Daniel Kennedy, Hannah M. Liddy, D. J. Moore, Philippe Peylin, Ewan Pinnington, Benjamin M. Sanderson, Marko Scholze, Christian Seiler, T. Luke Smallman, Noemi Vergopolan, Toni Viskari, Mathew Williams, John M. Zobitz
- DOI: 10.1029/2024ms004733
Research Groups
- Department of Mathematics and Statistics, Faculty of Environment, Science and Economy, University of Exeter, UK
- European Centre for Medium‐Range Weather Forecasts (ECMWF), UK
- National Centre for Earth Observation, Department of Meteorology, University of Reading, UK
- Department of Geography and Environment, Western University, Canada
- Department of Biology, Western University, Canada
- Global Modeling and Assimilation Office, NASA Goddard Space Flight Center, USA
- Science Systems and Applications, Inc., USA
- Computing and Data Sciences, Boston University, USA
- CICERO, Norway
- Finnish Meteorological Institute, Finland
- Laboratoire des Sciences du Climat et de l’Environnement (LSCE/IPSL), CEA‐CNRS‐UVSQ, Université Paris‐Saclay, France
- NSF National Center for Atmospheric Research (NCAR), USA
- Earth and Environmental Engineering Department, Columbia University, USA
- Max Planck Institute for Biogeochemistry, Germany
- Departamento de Ciências e Engenharia do Ambiente (DCEA), Faculdade de Ciências e Tecnologia (FCT), Universidade Nova de Lisboa, Portugal
- ELLIS Unit Jena, Germany
- UK Centre for Ecology and Hydrology, UK
- Department of Earth & Environment, Boston University, USA
- The Inversion Lab, Germany
- Columbia Climate School, Columbia University, USA
- NASA Goddard Institute for Space Studies, USA
- School of Natural Resources and the Environment, University of Arizona, USA
- Department of Physical Geography and Ecosystem Science, Lund University, Sweden
- School of Environmental Studies, Queen's University, Canada
- School of GeoSciences and National Centre for Earth Observation, University of Edinburgh, UK
- Ken Kennedy Institute, Rice University, USA
- Earth Environment and Planetary Sciences, Rice University, USA
- European Commission, Joint Research Center (JRC), Italy
- Department of Mathematics, Computer Science, and Data Science, Augsburg University, USA
Short Summary
This paper reviews the current state, challenges, and opportunities in parameter estimation for land surface models (LSMs) using data assimilation (DA) and machine learning (ML), with a particular focus on carbon-water-vegetation interactions. It highlights how ML can improve computational efficiency and help represent poorly captured processes, and it advocates for international collaboration to improve LSM predictive capabilities.
Objective
- To review the progress made in using data assimilation (DA) for parameter optimization in land surface models (LSMs), with a focus on carbon-water-vegetation interactions.
- To discuss the technical challenges faced by the community in LSM parameter estimation.
- To outline how machine learning (ML) can help address these challenges and propose avenues for future work integrating ML and DA to reduce uncertainties in LSMs.
- To identify future priorities, including the need for international collaborations, to leverage Earth observation data and ML advances for enhanced LSM predictive capabilities.
Study Configuration
- Spatial Scale: Site-level, multi-site, regional, catchment scale, global (typically 0.5° or coarser grid resolution for global models, down to 500 m for satellite products).
- Temporal Scale: Half-hourly to centennial timescales, covering seasonal cycles, inter-annual variability, multi-year periods, and historical periods spanning hundreds to thousands of years.
Methodology and Data
- Models used: The paper reviews applications across a wide range of Land Surface Models (LSMs) and ecosystem models, including ORCHIDEE, JULES, CLM (Community Land Model), BETHY (Biosphere Energy Transfer Hydrology), DALEC (Data Assimilation Linked Ecosystem Carbon), SIPNET, TECOS, FöBAAR, CARDAMOM, LPJ-GUESS, BEPS, CABLE, CLASSIC, D&B, ED (Ecosystem Demography), ECLand, FATES, JSBACH, Noah, SDBM, and SiB.
- Data sources:
  - Satellite: Reflectance, Normalized Difference Vegetation Index (NDVI), Solar-Induced Fluorescence (SIF), Vegetation Optical Depth (VOD), Surface Soil Moisture (SSM), Land Surface Temperature (LST), XCO2 (column-averaged carbon dioxide), above-ground biomass, burned area, snow cover, full-waveform lidar data (GEDI).
  - Observation (in-situ): Eddy covariance flux tower data (CO2, energy flux), atmospheric CO2 mole fractions, biomass, soil carbon stocks, soil radiocarbon measurements, tree ring data (widths, isotopic data), river flow, precipitation, trace gas flux measurements (carbonyl sulfide, nitrous oxide, methane).
  - Reanalysis: Meteorological forcing data (e.g., atmospheric reanalysis products).
  - Experimental: Elevated CO2 experiments.
Main Results
- Data assimilation (DA) is a powerful tool for reducing parametric uncertainty in land surface models (LSMs), particularly for carbon-water-vegetation interactions.
- Key challenges in LSM parameter estimation include selecting sensitive parameters and their prior distributions, accurately characterizing model and observation errors (especially biases and correlations), developing robust observation operators, handling spatial and temporal heterogeneity, managing large and multiple observational datasets, and incorporating the computationally expensive pre-observation historical period into assimilation.
- Machine learning (ML) offers significant opportunities to address these challenges by:
  - Reducing computational costs through model emulation (e.g., parameter perturbation emulators, history matching), making computationally intensive calibration techniques like MCMC feasible.
  - Improving model representation via hybrid modeling, where ML substitutes uncertain or missing parameterizations (e.g., human processes, poorly understood physical processes) or generates improved spatial parameterizations.
  - Developing ML-generated observation operators that are simpler, accommodate multiple observation types, correct climatological biases, and facilitate assimilation of low-level radiance observations.
  - Serving as diagnostic tools to identify and characterize model structural errors by analyzing systematic differences between physical model components and their ML counterparts.
  - Enhancing the optimization process itself by speeding up the search, automatically tuning algorithm hyperparameters, and facilitating model differentiability (e.g., through code translation between programming languages or neural network emulation of tangent linear/adjoint models).
- Future priorities include testing novel data sets (e.g., manipulation experiments, soil carbon stocks, tree rings, GEDI lidar, trace gas fluxes) and experimental configurations (e.g., multi-site vs. single-site, PFT-dependent vs. regional parameters, impact of record length), identifying and improving structural errors, and moving towards coupled land surface-atmospheric transport assimilation and, ultimately, fully coupled Earth System Model (ESM) assimilation with comprehensive uncertainty quantification.
- International collaboration, fostered by initiatives like the AIMES Land DA Working Group and the International Land Model Forum, is crucial for knowledge exchange, training, and developing standardized methods and shared toolboxes to accelerate progress in LSM calibration and DA.
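The emulation point above (a cheap surrogate making MCMC calibration affordable) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-parameter saturating curve standing in for an expensive LSM run, the polynomial least-squares emulator (real applications more often use Gaussian processes or neural networks), the uniform prior bounds, the synthetic observations, and all function and variable names.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0, 15)   # observation times (hypothetical)
lo = np.array([0.5, 0.3])       # prior lower bounds for (a, b), assumed
hi = np.array([2.0, 1.0])       # prior upper bounds for (a, b), assumed

def expensive_model(theta):
    """Toy stand-in for a costly LSM run: a * (1 - exp(-b * t))."""
    a, b = theta
    return a * (1.0 - np.exp(-b * t))

def features(thetas):
    """Polynomial basis a^i * b^j (i <= 2, j <= 4) for each parameter set."""
    thetas = np.atleast_2d(thetas)
    a, b = thetas[:, 0], thetas[:, 1]
    return np.stack([a**i * b**j for i in range(3) for j in range(5)], axis=1)

# 1. Perturbed-parameter ensemble: run the "expensive" model at sampled parameters.
train_theta = rng.uniform(lo, hi, size=(200, 2))
train_out = np.array([expensive_model(th) for th in train_theta])

# 2. Fit the emulator by least squares (one coefficient column per output time).
coef, *_ = np.linalg.lstsq(features(train_theta), train_out, rcond=None)

def emulator(theta):
    """Cheap approximation of expensive_model for a single parameter vector."""
    return (features(theta) @ coef)[0]

# 3. Synthetic observations from known "true" parameters plus Gaussian noise.
theta_true = np.array([1.5, 0.6])
sigma = 0.05
obs = expensive_model(theta_true) + rng.normal(0.0, sigma, size=t.size)

def log_posterior(theta):
    """Uniform prior on [lo, hi]; Gaussian likelihood evaluated with the emulator."""
    if np.any(theta < lo) or np.any(theta > hi):
        return -np.inf
    resid = obs - emulator(theta)
    return -0.5 * np.sum((resid / sigma) ** 2)

# 4. Random-walk Metropolis that only ever calls the emulator, never the model.
theta = 0.5 * (lo + hi)
logp = log_posterior(theta)
chain = []
for _ in range(6000):
    proposal = theta + rng.normal(0.0, 0.02, size=2)
    logp_new = log_posterior(proposal)
    if np.log(rng.uniform()) < logp_new - logp:
        theta, logp = proposal, logp_new
    chain.append(theta)

chain = np.array(chain[1000:])  # discard burn-in
post_mean = chain.mean(axis=0)
post_std = chain.std(axis=0)
print("posterior mean:", post_mean, "posterior std:", post_std)
```

In this sketch the model is run 200 times to train the emulator, while the sampler evaluates the emulator 6,000 times; the saving grows with the cost of each model run. The posterior is only as trustworthy as the emulator, so its error should be checked on held-out parameter sets before the samples are interpreted.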
Contributions
This review article synthesizes the current state of parameter estimation in land surface models (LSMs), providing a comprehensive overview of the progress made with data assimilation (DA) techniques over the past two decades. Its original value lies in:
- Systematically outlining the key technical challenges that the LSM DA community currently faces, particularly concerning carbon-water-vegetation interactions.
- Proposing a structured framework for how recent advancements in machine learning (ML) can be integrated into the DA workflow to overcome these challenges, offering specific applications such as emulators, hybrid models, and optimization process improvements.
- Identifying critical future priorities and research directions, including the testing of novel observational datasets, the need for rigorous testing of DA experimental configurations, strategies for identifying and improving model structural errors, and the ultimate goal of fully coupled Earth System Model (ESM) assimilation with uncertainty quantification.
- Emphasizing the crucial role of international collaboration, knowledge sharing, and the development of community toolboxes to accelerate progress and standardize methodologies in this rapidly evolving field.
Funding
- H2020 Marie Skłodowska‐Curie Actions (grant no. 101026422)
- UKRI National Centre for Earth Observation (International Science Programme, NE/X006328/1; LTSS programme, NE/R016518/1; UK's EO Climate Information Service, NE/X019071/1)
- NSF MSB (2406258)
- NASA CMS (80NSSC21K0965)
- National Science Foundation (NSF) Science and Technology Center (STC) Learning the Earth with Artificial Intelligence and Physics (LEAP), Award # 2019625‐STC
- U.S. Department of Energy, Office of Biological & Environmental Research (BER), under Lawrence Livermore National Lab subaward DE‐AC52‐07NA27344, Lawrence Berkeley National Lab subaward DE‐AC02‐05CH11231, and Pacific Northwest National Lab subaward DE‐AC05‐76RL01830
- National Science Foundation (NSF) National Center for Atmospheric Research (NCAR), Cooperative Agreement No. 1852977
- Wolfe‐Western Fellowship At‐Large for Outstanding Newly Recruited Research Scholars Endowment Fund
- Research Council of Finland (Grant 337552)
- Horizon Europe, HORIZON‐MISS‐2022‐SOIL‐01‐05 (Grant Agreement 101082194)
- ESA (contract 4000141232 within the Carbon Science Cluster)
- Swedish strategic research areas: ModElling the Regional and Global Earth system (MERGE), the e‐science collaboration (eSSENCE), and Biodiversity and Ecosystems in a Changing Climate (BECC).
Citation
@article{Raoult2025Parameter,
author = {Raoult, Nina and Douglas, Natalie and MacBean, Natasha and Kolassa, Jana and Quaife, Tristan and Roberts, Andrew and Fisher, Rosie A. and Fer, Istem and Bacour, Cédric and Dagon, Katherine and Hawkins, Linnia and Carvalhais, Nuno and Cooper, Elizabeth and Dietze, Michael C. and Gentine, Pierre and Kaminski, Thomas and Kennedy, Daniel and Liddy, Hannah M. and Moore, D. J. and Peylin, Philippe and Pinnington, Ewan and Sanderson, Benjamin M. and Scholze, Marko and Seiler, Christian and Smallman, T. Luke and Vergopolan, Noemi and Viskari, Toni and Williams, Mathew and Zobitz, John M.},
title = {Parameter Estimation in Land Surface Models: Challenges and Opportunities With Data Assimilation and Machine Learning},
journal = {Journal of Advances in Modeling Earth Systems},
year = {2025},
doi = {10.1029/2024ms004733},
url = {https://doi.org/10.1029/2024ms004733}
}
Original Source: https://doi.org/10.1029/2024ms004733