Yegin et al. (2025) Model Selection Challenges in Non-Stationary Precipitation Estimation: The Role of AIC, BIC, and Covariate Choice

Identification

Journal: Water Resources Management
Year: 2025
Date: 2025-12-23
Authors: Murat Yegin, Gülşah Karakaya, Elçin Kentel
DOI: 10.1007/s11269-025-04357-6

Research Groups

Department of Civil Engineering, Middle East Technical University, Ankara, Turkey
Department of Business Administration, Middle East Technical University, Ankara, Turkey

Short Summary

This study evaluates 50-year precipitation estimates from stationary and non-stationary models across 53 meteorological stations in Türkiye, analyzing the impact of model selection criteria (AIC, BIC), covariate choice, and probability distributions on predictive performance and the plausibility of extreme estimates. It reveals inconsistencies in model rankings between AIC and BIC, and highlights the trade-offs between model complexity, accuracy, and the risk of unrealistic extreme value predictions.

Objective

To evaluate the performance of stationary (S) and non-stationary (NS) models for 50-year annual maximum precipitation (AMP) estimation.
To assess the consistency of model rankings based on Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) across different probability distributions.
To analyze the effects of probability distributions and model types on 50-year return period precipitation (P50) estimates.
To identify suitable covariates for non-stationary AMP modeling in southern and central Türkiye, including two newly proposed covariates.

Study Configuration

Spatial Scale: 53 meteorological stations across southern and central Türkiye.
Temporal Scale: 50-year return period estimates of annual maximum precipitation, derived from daily precipitation and temperature data from 1976 to 2010 (35 years of observations).
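
For reference, the 50-year estimate (P50) is the quantile that is exceeded in any given year with probability 1/50. The relation below is the standard return-level definition, not a formula quoted from the paper; the symbols F (the fitted cumulative distribution function of AMP) and x_T are notation assumed here.

```latex
F(x_T) = 1 - \frac{1}{T}, \qquad T = 50 \;\Rightarrow\; F(P_{50}) = 1 - \frac{1}{50} = 0.98
```

With 35 years of observations, P50 lies beyond the length of the record, so the estimate is an extrapolation into the upper tail of the fitted distribution. In a non-stationary model the parameters of F depend on covariates, so the quantile is conditional on the covariate values used.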

Methodology and Data

Models used:
- Stationary (S) and Non-Stationary (NS) models (T0: S; T1: NS for location parameter; T2: NS for scale parameter; T3: NS for both location and scale parameters).
- Probability Distributions: Gamma, Generalized Extreme Value (GEV), Gumbel (Gu), and Log-normal.
- Covariates: Time (Y), maximum temperature (Tmax), North Atlantic Oscillation (NAO) index, number of days with maximum temperature exceeding the long-term average (DN), and meteorological drought magnitude index (MoMD).
- Software: "extRemes" package and the Generalized Additive Models for Location, Scale and Shape (GAMLSS) package.
- Parameter Estimation: Maximum Likelihood Estimation (MLE).
- Model Selection Criteria: Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC).
Data sources:
- Annual maximum precipitation (AMP) data from 53 meteorological stations.
- Daily precipitation and temperature data from 1976 to 2010.
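
The model types and selection criteria listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used the extRemes and GAMLSS packages); it assumes a Gumbel distribution whose location depends linearly on a single covariate (a T1-type model), and the function names, the linear link, and all numeric values are hypothetical. Setting the slope `mu1` to zero recovers the stationary T0 model.

```python
import math

def gumbel_nll(amp, covariate, mu0, mu1, sigma):
    """Negative log-likelihood of a Gumbel model with location mu0 + mu1 * x.

    The linear dependence on one covariate is an assumption for illustration.
    """
    if sigma <= 0:
        return math.inf  # scale must be positive
    total = 0.0
    for y, x in zip(amp, covariate):
        z = (y - (mu0 + mu1 * x)) / sigma
        # Gumbel log-density: -log(sigma) - z - exp(-z)
        total += math.log(sigma) + z + math.exp(-z)
    return total

def aic(nll, k):
    """Akaike Information Criterion: 2k - 2 log L."""
    return 2 * k + 2 * nll

def bic(nll, k, n):
    """Bayesian Information Criterion: k log(n) - 2 log L."""
    return k * math.log(n) + 2 * nll

# Hypothetical values, not data from the study.
amp = [40.0, 55.0, 38.0, 61.0, 47.0]   # annual maxima in mm
x = [0.0, 1.0, 2.0, 3.0, 4.0]          # covariate, e.g. a time index

nll_t0 = gumbel_nll(amp, x, mu0=45.0, mu1=0.0, sigma=10.0)  # stationary, k = 2
nll_t1 = gumbel_nll(amp, x, mu0=42.0, mu1=1.5, sigma=10.0)  # NS location, k = 3

n = len(amp)
print("T0:", aic(nll_t0, 2), bic(nll_t0, 2, n))
print("T1:", aic(nll_t1, 3), bic(nll_t1, 3, n))
```

In practice the parameters would be chosen by minimizing the negative log-likelihood with an optimizer rather than fixed by hand as here. The two criteria differ only in the penalty per parameter: 2 for AIC and log(n) for BIC. For the 35-year record in this study, log(35) is about 3.56, so BIC charges more for each added non-stationary parameter than AIC does.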

Main Results

Model rankings based on AIC and BIC for the top 30 models showed inconsistencies across most meteorological stations, emphasizing the critical role of the performance criterion in model selection.
The GEV distribution was most frequently selected as the best model by AIC, and the Gumbel distribution by BIC.
Non-stationary (NS) models generally improved predictive performance, with AIC favoring more complex NS models and BIC favoring simpler models (including S models for 55% of stations).
Employing multiple probability distributions (Gamma, GEV, Gumbel, Log-normal) significantly increased the likelihood of obtaining well-performing NS models: approximately 80% of well-performing models were obtained when Gumbel and Log-normal were included in addition to GEV.
NS GEV models sometimes produced unrealistically high 50-year precipitation (P50) estimates (e.g., 5 × 10^6 mm and 1 × 10^8 mm for some stations), highlighting a risk associated with this distribution.
P50 values obtained from AIC-based best models were generally greater than those from BIC-based models, suggesting that AIC is the "safer" option for hydraulic design: larger design values increase costs but enhance safety.
The North Atlantic Oscillation (NAO) index and the newly proposed covariate, the number of days with maximum temperature exceeding the long-term average (DN), frequently appeared in well-performing NS models. Time (Y) was beneficial when incorporated into complex models, but not as a sole covariate.
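
The sensitivity of GEV-based P50 estimates to the fitted parameters can be illustrated with the standard GEV return-level formula. The sketch below is not taken from the paper and does not claim to reproduce the mechanism behind the unrealistic values reported above; it only shows how strongly the 50-year quantile grows with the shape parameter. It assumes the convention in which a positive shape `xi` means a heavy upper tail, and the function name and all parameter values are hypothetical.

```python
import math

def gev_return_level(mu, sigma, xi, T=50.0):
    """T-year return level of a GEV(mu, sigma, xi) distribution.

    Convention assumed here: xi > 0 gives a heavy upper tail and
    xi = 0 is the Gumbel limit.
    """
    y = -math.log(1.0 - 1.0 / T)  # -log of the non-exceedance probability
    if abs(xi) < 1e-9:
        return mu - sigma * math.log(y)        # Gumbel limit
    return mu + sigma / xi * (y ** (-xi) - 1.0)

# Hypothetical location and scale in mm; only the shape parameter varies.
for xi in (-0.2, 0.0, 0.2, 0.5, 1.0):
    print(xi, round(gev_return_level(mu=50.0, sigma=15.0, xi=xi), 1))
```

Because the shape parameter enters through a power of a small number, modest overestimation of `xi` from a short record inflates the return level disproportionately. In a non-stationary model, location or scale values driven by covariates can amplify the estimate further.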

Contributions

Conducted a comprehensive evaluation of 1020 stationary and non-stationary models per station across 53 meteorological stations, utilizing multiple probability distributions and two distinct model selection criteria (AIC and BIC).
Introduced and evaluated two novel covariates, the number of days with maximum temperature exceeding the long-term average (DN) and a meteorological drought magnitude index (MoMD), for non-stationary annual maximum precipitation (AMP) modeling.
Demonstrated significant inconsistencies in model rankings between AIC and BIC, providing new insights into their implications for model complexity and the physical plausibility of extreme precipitation estimates.
Highlighted the critical trade-offs between model accuracy, complexity, and the reliability of extreme value predictions, specifically identifying the risk of unrealistic estimates when using GEV in non-stationary models.
Emphasized the importance of considering multiple probability distributions in non-stationary modeling to avoid potential underestimation or overestimation of extreme precipitation values.

Funding

Scientific and Technological Research Council of Turkey (TUBITAK) under Grant Number 220N054.

Citation

@article{Yegin2025Model,
  author = {Yegin, Murat and Karakaya, Gülşah and Kentel, Elçin},
  title = {Model Selection Challenges in Non-Stationary Precipitation Estimation: The Role of AIC, BIC, and Covariate Choice},
  journal = {Water Resources Management},
  year = {2025},
  doi = {10.1007/s11269-025-04357-6},
  url = {https://doi.org/10.1007/s11269-025-04357-6}
}

Original Source: https://doi.org/10.1007/s11269-025-04357-6