Baruah et al. (2025) Interpretable machine learning for predicting rating curve parameters using channel geometry and hydrological attributes across the United States
Identification
- Journal: Scientific Reports
- Year: 2025
- Date: 2025-12-02
- Authors: Anupal Baruah, Reihaneh Zarrabi, Sagy Cohen, J. Michael Johnson, Riley McDermott
- DOI: 10.1038/s41598-025-27881-2
Research Groups
- Department of Geography and Environment, The University of Alabama, Tuscaloosa, USA
- NOAA Office of Water Prediction Affiliate, Tuscaloosa, USA
Short Summary
This study developed interpretable machine learning models to predict power-law rating curve parameters (α, β) across the contiguous United States (CONUS) stream network and showed how those parameters respond to channel geometry and hydrometeorological factors, with applications to flood risk assessment.
Objective
- To develop a data-driven, interpretable machine learning approach, trained on large observational datasets, for predicting power-law rating curve parameters (α, β) at continental scale across the contiguous United States (CONUS) stream network.
- To explore the influence of channel geometry, topography, and hydrometeorological factors on stage-discharge relationships.
- To demonstrate the implications of predicted rating curve parameters for depth estimation from National Water Model return period flows and during historical flood events.
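The depth-estimation objective above can be illustrated with a minimal sketch. It assumes the power-law rating curve takes the common form Q = α·h^β (discharge as a function of depth); this summary does not state which orientation the paper uses, so treat the form as an assumption. The function name `depth_from_discharge`, the parameter values, and the return-period flows are all hypothetical and chosen only for illustration.

```python
def depth_from_discharge(discharge, alpha, beta):
    """Invert the assumed form Q = alpha * h**beta: h = (Q / alpha) ** (1 / beta)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if discharge < 0:
        raise ValueError("discharge must be non-negative")
    return (discharge / alpha) ** (1.0 / beta)


# Hypothetical reach parameters and return-period flows (m^3/s), illustration only.
alpha, beta = 12.0, 1.6
return_period_flows = {2: 85.0, 5: 140.0, 10: 190.0, 25: 260.0, 50: 320.0, 100: 390.0}

# Map each return period (years) to an estimated depth (m).
depths = {
    years: depth_from_discharge(q, alpha, beta)
    for years, q in return_period_flows.items()
}

for years, h in depths.items():
    print(f"{years:>3}-year flow: depth ~ {h:.2f} m")
```

Because the assumed curve is monotonic for positive α and β, larger return-period flows always map to larger depths, which makes the inversion well defined for every reach.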
Study Configuration
- Spatial Scale: Contiguous United States (CONUS), covering approximately 2.7 million river reaches of the NHDPlus stream network.
- Temporal Scale: HYDRoSWOT observations from 1940 to 2014; 30-year mean annual precipitation; application to hourly streamflow during a historical flood event (October 2016) and to flows at 2-, 5-, 10-, 25-, 50-, and 100-year return periods.
Methodology and Data
- Models used: Multivariate Regression (MVR), eXtreme Gradient Boosting (XGBoost/XGB), Random Forest Regressor (RFR), Support Vector Regression (SVR).
- Data sources:
  - HYDRoacoustic dataset in support of the Surface Water Oceanographic Topography satellite mission (HYDRoSWOT): channel and flow attributes from ~10,000 USGS stations, 1940 to 2014.
  - National Hydrography Dataset Plus (NHDPlus v2.1): catchment and stream properties for ~2.7 million flow reaches.
  - Stream-Catchment (StreamCat) dataset: landscape metrics for NHDPlus v2 stream networks.
  - Predicted channel hydraulic geometry attributes (bankfull and mean-flow width and depth) from Zarrabi et al. (2025).
  - Predicted median sediment particle size (d50) from Abeshu et al. (2022).
  - USGS National Water Information System (NWIS): observed stage-discharge rating curves.
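As a rough illustration of how target values of α and β might be derived from an observed stage-discharge rating curve, the sketch below fits a power law by ordinary least squares in log-log space. It assumes the form Q = α·h^β, which this summary does not confirm, and the function name `fit_power_law` is hypothetical; the paper's actual fitting procedure may differ.

```python
import math


def fit_power_law(stages, discharges):
    """Fit Q = alpha * h**beta via least squares on log(Q) = log(alpha) + beta * log(h).

    Returns (alpha, beta, r2), where r2 is computed in log space.
    """
    # Logarithms require strictly positive values, so drop other pairs.
    pairs = [
        (math.log(h), math.log(q))
        for h, q in zip(stages, discharges)
        if h > 0 and q > 0
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two positive (stage, discharge) pairs")

    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("stages must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)

    beta = sxy / sxx                          # slope in log-log space
    log_alpha = mean_y - beta * mean_x        # intercept in log-log space
    alpha = math.exp(log_alpha)

    ss_res = sum((y - (log_alpha + beta * x)) ** 2 for x, y in pairs)
    ss_tot = sum((y - mean_y) ** 2 for _, y in pairs)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return alpha, beta, r2


# Synthetic rating curve generated from known parameters, illustration only.
stages = [0.5, 1.0, 1.5, 2.0, 3.0]
discharges = [2.5 * h ** 1.7 for h in stages]
alpha, beta, r2 = fit_power_law(stages, discharges)
```

Fitting in log space turns the power law into a straight line, so the exponent is the slope and the coefficient is the exponential of the intercept. Note that the resulting R² describes fit quality in log space, which is not the same quantity as R² on untransformed discharge.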
Main Results
- Power-law regression effectively represents USGS stage-discharge rating curves, with 97% of stations in CONUS exhibiting a coefficient of determination (R²) greater than 0.93.
- Tier-1 XGB models achieved high prediction accuracy at gauge sites for α (R²=0.66, percent bias pBIAS=0.16%) and β (R²=0.74, pBIAS=0.42%).
- Tier-2 XGB models, applicable across the NHDPlus network, showed reasonable accuracy for α (R²=0.55, pBIAS=-1.71%) and β (R²=0.71, pBIAS=-0.49%), balancing accuracy with broad applicability.
- The rating curve coefficient (α) is highly sensitive to the channel width-to-depth ratio at mean and bankfull flow, river slope, elevation, aridity index, and precipitation; higher α values are concentrated in regions with low elevation, high rainfall, and high aridity index values.
- The rating curve exponent (β) generally shows the opposite spatial trend to α, with higher values in high-elevation regions and lower values in coastal areas.
- Reconstructed rating curves using predicted parameters showed good agreement with USGS observations, with a median Root Mean Square Error (RMSE) of 0.175 meters for stage predictions.
- The framework enables continental-scale flow depth estimation for various National Water Model (NWM) return periods and historical flood events, supporting enhanced flood risk assessment.
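The accuracy metrics quoted above (R², pBIAS, RMSE) can be computed as in the following minimal sketch. The sign convention for percent bias varies between references; the version here, 100 × Σ(predicted − observed) / Σ(observed), is an assumption and may not match the convention used in the paper. Function names and sample values are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def percent_bias(observed, predicted):
    """Percent bias; positive means over-prediction under this assumed convention."""
    total_error = sum(p - o for o, p in zip(observed, predicted))
    return 100.0 * total_error / sum(observed)


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical observed and predicted stages (m), illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

scores = {
    "r2": r_squared(observed, predicted),
    "pbias": percent_bias(observed, predicted),
    "rmse": rmse(observed, predicted),
}
```

A near-zero percent bias only indicates that over- and under-predictions cancel on average, so it is best read alongside R² and RMSE rather than on its own.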
Contributions
- Developed a novel two-tier interpretable machine learning framework for continental-scale prediction of power-law rating curve parameters (α, β) across the entire NHDPlus stream network in CONUS.
- Quantified the influence of channel geometry, topography, and hydrometeorological factors on rating curve parameters at a continental scale, addressing a significant research gap.
- Demonstrated the practical application of predicted rating curve parameters for estimating water depths at National Water Model return periods and for historical flood events, enhancing flood risk assessment and forecasting capabilities in data-scarce regions.
- Provided open-source datasets and code, promoting reproducibility and future research in hydrological modeling.
Funding
- National Oceanic & Atmospheric Administration (NOAA)
- Cooperative Institute for Research to Operations in Hydrology (CIROH)
- NOAA Cooperative Agreement with The University of Alabama (NA22NWS4320003)
Citation
@article{Baruah2025Interpretable,
author = {Baruah, Anupal and Zarrabi, Reihaneh and Cohen, Sagy and Johnson, J. Michael and McDermott, Riley},
title = {Interpretable machine learning for predicting rating curve parameters using channel geometry and hydrological attributes across the United States},
journal = {Scientific Reports},
year = {2025},
doi = {10.1038/s41598-025-27881-2},
url = {https://doi.org/10.1038/s41598-025-27881-2}
}
Original Source: https://doi.org/10.1038/s41598-025-27881-2