Colleoni et al. (2025) smash v1.0: a differentiable and regionalizable high-resolution hydrological modeling and data assimilation framework
Identification
- Journal: Geoscientific Model Development
- Year: 2025
- Date: 2025-10-10
- Authors: François Colleoni, Ngo Nghi Truyen Huynh, Pierre-André Garambois, Maxime Jay‐Allemand, Didier Organde, Benjamin Renard, Thomas de Fournas, Apolline El Baz, Julie Demargne, Pierre Javelle
- DOI: 10.5194/gmd-18-7003-2025
Research Groups
- INRAE, Aix-Marseille Université, RECOVER, Aix-en-Provence, France
- HYDRIS Hydrologie, Montferrier-sur-Lez, France
Short Summary
This paper introduces smash v1.0, an open-source, differentiable, and regionalizable framework for high-resolution hydrological modeling and data assimilation. It demonstrates the framework's capabilities in local calibration (median Kling–Gupta efficiency > 0.8 at 3 km resolution) and regionalization (Kling–Gupta efficiency > 0.6 and Nash–Sutcliffe efficiency > 0.6 at 500 m resolution) across various scales and model structures.
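The two scores quoted above can be sketched in a few lines. This is a minimal illustration, assuming the standard Nash–Sutcliffe definition and the Gupta et al. (2009) form of the Kling–Gupta efficiency; the exact variants used in the paper are not stated in this summary, and the function names `nse` and `kge` are chosen here for illustration only.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((s - o) ** 2 for o, s in zip(obs, sim))
    squared_deviation = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / squared_deviation


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed 2009 form): combines correlation, variability and bias."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Hypothetical discharge series, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
doubled = [2.0 * q for q in observed]
```

Both scores equal 1 for a perfect simulation. A simulation that doubles every observed value keeps a perfect correlation but is penalized by the variability and bias terms of the KGE.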
Objective
- To present the smash v1.0 framework, detailing its algorithms, open-source code, documentation, and tutorials.
- To benchmark smash's performance on state-of-the-art datasets, highlighting its readiness for scientific research and operational hydrological applications, particularly for high-resolution, spatially distributed modeling and data assimilation.
Study Configuration
- Spatial Scale: From catchment to country scales, with specific applications at 3 km (1′30′′) and 500 m spatial resolutions.
- Temporal Scale: Daily and hourly time steps, with applications covering periods of 1 to 14 years.
Methodology and Data
- Models used:
- Core Framework: smash (Spatially distributed Modeling and ASsimilation for Hydrology)
- Snow Operators (Msnw): zero (no snow module), ssn (degree-day module)
- Hydrological Operators (Mrr): gr4, gr5, grd, loieau (GR-like conceptual models), vic3l (VIC-like conceptual model)
- Routing Operators (Mhy): lag0 (instantaneous), lr (linear reservoir), kw (kinematic wave)
- Regionalization: Descriptor-to-parameter neural networks (multi-layer perceptron) and multi-linear regression.
- Differentiation: Tapenade automatic differentiation engine for adjoint model derivation.
- Optimization: L-BFGS-B (limited-memory quasi-Newton) and Adam (adaptive learning rate).
- Interfacing: f90wrap for Fortran-to-Python interface.
- Data Assimilation: Gradient-based Variational Data Assimilation (VDA).
- Data sources:
- Global: CAMELS (Caravan-CAMELS) for discharge time series, ERA5 for atmospheric reanalysis (temperature, potential evapotranspiration), MSWEP for precipitation, MERIT Hydro IHU for flow directions, MERIT DEM for topographic slope, SoilGrids for soil properties (sand, clay content).
- France (Aude River): ANTILOPE J+1 (Météo-France) for precipitation, SAFRAN (Météo-France) for temperature and potential evapotranspiration, HydroDem for flow direction and topographic slope, CORINE Land Cover 2018 for land cover, Organde et al. (2013) for drainage density, BDLISA for karst percentage, HydroPortail Service Central Vigicrues for discharge time series.
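The descriptor-to-parameter idea listed under Regionalization can be illustrated with a toy mapping. The sketch below assumes a multi-linear combination of physiographic descriptors passed through a sigmoid so that the resulting parameter stays within physical bounds. The bounding function, the descriptor values, the weights, and the name `descriptors_to_parameter` are all hypothetical; this is not the smash API.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def descriptors_to_parameter(descriptors, weights, bias, lower, upper):
    """Map one grid cell's descriptor vector to a parameter value within [lower, upper]."""
    z = bias + sum(w * d for w, d in zip(weights, descriptors))
    return lower + (upper - lower) * sigmoid(z)


# Hypothetical 2 x 2 grid; each cell holds (normalized slope, normalized mean annual rainfall).
descriptor_grid = [
    [(0.1, 0.8), (0.4, 0.6)],
    [(0.7, 0.3), (0.9, 0.1)],
]
weights = (-2.0, 1.5)        # hypothetical regression weights, shared by all cells
bias = 0.0
lower, upper = 1.0, 1000.0   # hypothetical bounds for a reservoir capacity (mm)

# One set of weights yields a spatially distributed parameter map.
capacity_map = [
    [descriptors_to_parameter(d, weights, bias, lower, upper) for d in row]
    for row in descriptor_grid
]
```

Because the weights are shared across cells, calibrating them on gauged catchments gives parameter values wherever descriptors are available, including ungauged cells. Replacing the linear combination with a multi-layer perceptron keeps the same overall structure.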
Main Results
- Local Calibration Performance: Median Kling–Gupta efficiency (KGE) > 0.8 was achieved for daily GR-like and VIC-like model structures at 3 km resolution in spatially distributed calibration over CAMELS catchments. Spatially distributed calibration consistently outperformed spatially uniform calibration, even in temporal validation.
- Regionalization Performance (CONUS): Regionalization learning using Artificial Neural Networks (ANN) and multi-linear regression yielded KGE > 0.6 in spatiotemporal validation across CONUS (3 km resolution, daily time step), significantly improving over uniform parameterization.
- High-Resolution Regionalization Performance (Aude River, France): For a Mediterranean flash-flood-prone case, high-resolution hourly GR-like modeling at 500 m resolution achieved a Nash–Sutcliffe efficiency (NSE) > 0.6 in spatiotemporal validation using ANN-based regionalization.
- Computational Efficiency: Adjoint model runs were 6 to 12 times slower than direct model runs, with better thread scaling for adjoint runs. Memory usage ranged from 0.17 GB to 27 GB, scaling with domain size, demonstrating feasibility for large-scale applications. Checkpointing significantly reduced memory peaks during adjoint runs.
- Parameter Interpretability: Analysis of regionalized parameters revealed significant linear correlations between hydrological parameters (e.g., melt coefficient, production reservoir capacity, routing parameters) and physiographic descriptors (e.g., topographic slope, mean annual rainfall, humidity index, land cover).
- Spatially Distributed Gradients: smash accurately and efficiently computes spatially distributed cost gradients, crucial for high-dimensional optimization and interpretability.
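To make the adjoint-based gradient results above concrete, the following toy example differentiates a single linear reservoir by hand and checks the result against central finite differences. It is a sketch under simplifying assumptions: smash derives its adjoint from Fortran code with Tapenade, whereas the reservoir equations, the squared-error cost, and the function names here are illustrative only.

```python
def forward(k, rain, obs, s0=10.0):
    """Run the toy reservoir; return the squared-error cost and the stored states."""
    s = s0
    states = []
    cost = 0.0
    for p, o in zip(rain, obs):
        states.append(s)       # keep the state for the backward sweep
        q = k * s              # outflow proportional to storage
        cost += (q - o) ** 2
        s = s + p - q          # storage update
    return cost, states


def adjoint_gradient(k, rain, obs, s0=10.0):
    """Backward sweep over the stored states: d(cost)/dk in one pass."""
    _, states = forward(k, rain, obs, s0)
    s_bar = 0.0                # adjoint of the storage after the last step
    k_bar = 0.0                # accumulated gradient with respect to k
    for s, o in zip(reversed(states), reversed(obs)):
        q = k * s
        q_bar = 2.0 * (q - o) - s_bar   # q enters the cost and the storage update
        k_bar += q_bar * s
        s_bar = s_bar + q_bar * k       # s enters the next storage and the outflow
    return k_bar


def finite_difference_gradient(k, rain, obs, eps=1e-6):
    """Central finite difference, used only to check the adjoint."""
    cost_plus, _ = forward(k + eps, rain, obs)
    cost_minus, _ = forward(k - eps, rain, obs)
    return (cost_plus - cost_minus) / (2.0 * eps)


# Hypothetical forcing and observations, for illustration only.
rain = [5.0, 0.0, 12.0, 3.0, 0.0, 0.0, 8.0, 1.0]
obs = [1.0, 1.2, 0.9, 2.0, 1.8, 1.5, 1.2, 1.9]
grad_adjoint = adjoint_gradient(0.15, rain, obs)
grad_fd = finite_difference_gradient(0.15, rain, obs)
```

The backward sweep needs the states saved during the forward run, which is why adjoint runs use more memory and why checkpointing helps. With one parameter the finite-difference check is cheap; with one parameter per grid cell it would require two model runs per cell, whereas the adjoint returns the full gradient in a single backward pass.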
Contributions
- Introduction of smash v1.0, the first open-source, fully differentiable, and regionalizable high-resolution hydrological modeling and data assimilation framework.
- Development of a modular operator chaining approach, allowing flexible combination of process-based conceptual models and hybrid physics-AI models (e.g., descriptor-to-parameter neural networks).
- Implementation of an efficient, differentiable Fortran solver with automatic adjoint model derivation (using Tapenade) for gradient-based optimization of large parameter vectors, supporting CPU parallel computing.
- Provision of a user-friendly Python interface (via f90wrap) and comprehensive open-source code, documentation, and tutorials to foster research and operational applications.
- Demonstration of the framework's capability for high-performance hydrological simulation and regionalization across diverse spatial and temporal scales using state-of-the-art global and national datasets.
- Enabling advanced data assimilation techniques, including simultaneous inference of high-dimensional, spatially distributed parameters and initial states, and the use of signature-based cost functions and spatial regularization.
Funding
- Agence Nationale de la Recherche MUFFINS project (grant no. ANR-21-CE04-0021-01)
Citation
@article{Colleoni2025smash,
author = {Colleoni, François and Huynh, Ngo Nghi Truyen and Garambois, Pierre-André and Jay‐Allemand, Maxime and Organde, Didier and Renard, Benjamin and de Fournas, Thomas and El Baz, Apolline and Demargne, Julie and Javelle, Pierre},
title = {smash v1.0: a differentiable and regionalizable high-resolution hydrological modeling and data assimilation framework},
journal = {Geoscientific Model Development},
year = {2025},
doi = {10.5194/gmd-18-7003-2025},
url = {https://doi.org/10.5194/gmd-18-7003-2025}
}
Original Source: https://doi.org/10.5194/gmd-18-7003-2025