Wang et al. (2026) Flash flood forecasting in North East England through weak label-guided mixture of experts with multi-scale explainability

Identification

Journal: Journal of Hydrology: Regional Studies
Year: 2026
Date: 2026-03-31
Authors: Jessica Wang, J.E. Sanderson, Wai Lok Woo
DOI: 10.1016/j.ejrh.2026.103402

Research Groups

School of Computer Science, Northumbria University, Newcastle Upon Tyne, United Kingdom

Short Summary

This study introduces a Weak Label–Guided Mixture of Experts (WL–MoE) framework for multi-horizon flash flood water-level forecasting in five fast-response catchments in North East England. The framework significantly improves predictive accuracy, particularly for high-water events, by leveraging specialized convolutional experts and a two-stage training scheme, while also providing multi-scale interpretability.

Objective

To develop and evaluate a Weak Label–Guided Mixture of Experts (WL–MoE) framework for robust, multi-horizon flash flood water-level forecasting that is accurate under severe regime imbalance and interpretable, particularly for high-impact events.

Study Configuration

Spatial Scale: Five fast-response catchments in Northumberland, North East England, along the River Tyne and its tributaries: Haltwhistle (W1), Acomb (W2), Riding Mill (W3), Stocksfield (W4), and Hepscott (W5).
Temporal Scale: Water-level and rainfall records at 15-minute resolution. Records span 2016–2025 for most sites and 2022–2025 for Stocksfield. The model forecasts water level 32 steps ahead, which at 15-minute resolution corresponds to an 8-hour lead time.
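The relationship between the 32 forecast steps and the 8-hour lead time, and the way a record might be cut into training pairs, can be sketched as follows. This is a minimal illustration only: `lookback` is a hypothetical input length (the summary does not state one), and `make_windows` is an illustrative helper, not the authors' preprocessing code.

```python
from typing import List, Tuple

STEP_MINUTES = 15   # sampling interval stated in the summary
HORIZON_STEPS = 32  # forecast length stated in the summary


def lead_time_hours(steps: int = HORIZON_STEPS,
                    step_minutes: int = STEP_MINUTES) -> float:
    """Convert a number of forecast steps into a lead time in hours."""
    return steps * step_minutes / 60.0


def make_windows(series: List[float], lookback: int,
                 horizon: int = HORIZON_STEPS
                 ) -> List[Tuple[List[float], List[float]]]:
    """Slice a series into (history, target) pairs for multi-horizon forecasting.

    `lookback` is an assumed input length; the summary does not give one.
    """
    pairs = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        history = series[start:start + lookback]
        target = series[start + lookback:start + lookback + horizon]
        pairs.append((history, target))
    return pairs
```

For example, `lead_time_hours()` returns 8.0, and a 100-sample series with an assumed lookback of 16 yields 53 (history, target) pairs, each target holding 32 values.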

Methodology and Data

Models used:
- Proposed: Weak Label–Guided Mixture of Experts (WL–MoE) framework, comprising:
  - Continuous Wavelet Transform (CWT) for input transformation (Morlet wavelet for water level, Mexican Hat wavelet for rainfall).
  - Soft gating network (Multilayer Perceptron) for input-conditional expert weighting.
  - Multiple convolutional expert networks (CNNs) with identical backbone architecture.
  - Two-stage training scheme: weak-label guided initialisation (using DTW-based TimeSeriesKMeans clustering) followed by fine-tuning.
  - Explainability suite: Macro-level expert-usage profiling and micro-level Grad-CAM saliency maps.
- Baselines: Multilayer Perceptron (MLP), Long Short-Term Memory network (LSTM), Convolutional Neural Network (CNN), Informer (Transformer-based), TimesNet (multi-period time-series), TsMixer (all-MLP mixing).
Data sources: Public hydrometric archive from the UK Department for Environment, Food and Rural Affairs (DEFRA).
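The weak labels used in the first training stage come from DTW-based clustering. As a rough illustration of the distance underlying that step, the sketch below implements textbook dynamic time warping and assigns a window to its nearest centroid. It assumes an absolute-difference local cost and precomputed centroids; library implementations may use a different local cost, and `nearest_centroid` is a hypothetical helper rather than the authors' labelling procedure.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Textbook dynamic-programming DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def nearest_centroid(window: Sequence[float],
                     centroids: List[Sequence[float]]) -> int:
    """Return the index of the DTW-closest centroid (used here as a weak label)."""
    distances = [dtw_distance(window, c) for c in centroids]
    return distances.index(min(distances))
```

Because DTW allows elastic alignment, a window such as `[0, 0, 1, 2]` has zero distance to `[0, 1, 2]`: the same rise shifted in time is treated as the same pattern, which is the property that makes it attractive for grouping hydrograph shapes.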

Main Results

Compared with the strongest baseline (TsMixer average), WL–MoE improved mean Nash–Sutcliffe Efficiency (NSE) from 0.8344 to 0.9008 on the full test set and from 0.3942 to 0.7285 on the high-water subset.
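For readers unfamiliar with the metric, NSE is one minus the ratio of squared forecast error to the variance of the observations, so 1 is a perfect forecast and 0 matches predicting the observed mean. The sketch below computes it on a full series and on a high-water subset. The `threshold` used to select the subset is a hypothetical parameter; the summary does not say how the high-water subset was defined.

```python
from typing import Sequence


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe Efficiency: 1 - SSE / sum of squared deviations from the mean."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0.0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - residual / variance


def nse_high_water(observed: Sequence[float], simulated: Sequence[float],
                   threshold: float) -> float:
    """NSE restricted to samples whose observed level is at or above `threshold`.

    `threshold` is an assumed selection rule, not taken from the paper.
    """
    kept = [(o, s) for o, s in zip(observed, simulated) if o >= threshold]
    return nse([o for o, _ in kept], [s for _, s in kept])
```

With observations `[1, 2, 3, 4]` and forecasts `[1, 2, 3, 5]`, the full-series NSE is 0.8, but restricting to levels of 3 and above gives -1.0. The subset has less variance to explain, so the same peak error is penalised far more heavily, which is consistent with baseline scores dropping sharply on high-water data.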
The largest performance gains were observed during rapidly rising high-water events, indicating that specialized experts better represent flood-onset and recession dynamics in flashy catchments.
The smooth soft-gating behavior suggests that hydrological states are transitional rather than sharply separated, contributing to more reliable and interpretable forecasts.
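Soft gating in a mixture of experts generally means that every expert contributes to the output in proportion to a softmax weight, rather than a single expert being selected. The sketch below shows that blending step in isolation. It assumes the gate logits and per-expert forecasts are already available; `mix_experts` is an illustrative name, and the real gate (an MLP) and experts (CNNs) are not reproduced here.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: non-negative weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(gate_logits: Sequence[float],
                expert_forecasts: Sequence[Sequence[float]]) -> List[float]:
    """Blend per-expert forecasts using soft gate weights.

    expert_forecasts[k][t] is expert k's prediction at forecast step t.
    """
    weights = softmax(gate_logits)
    horizon = len(expert_forecasts[0])
    return [
        sum(w * forecast[t] for w, forecast in zip(weights, expert_forecasts))
        for t in range(horizon)
    ]
```

When the gate logits are equal, the output is the plain average of the experts; as one logit grows, the blend moves continuously toward that expert. This continuity is what allows the weights to shift gradually as a catchment passes from one hydrological state to another.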
An expert pool of nine experts provided the optimal balance between predictive accuracy, stable expert specialization, and computational cost, with diminishing returns observed for larger pools.
Macro-level interpretability revealed consistent specialization of experts into distinct hydrometeorological niches. Micro-level Grad-CAM saliency maps highlighted physically meaningful time–scale features, such as high-frequency rainfall for flood onset and mid/low frequencies for recession.
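The time–scale features mentioned above live in the scalogram produced by the Continuous Wavelet Transform, where small scales correspond to high frequencies. The following is a minimal, dependency-free sketch of a CWT using the Mexican Hat (Ricker) wavelet, the one the summary names for rainfall. The zero padding at the edges, the kernel truncation at five scales, and the function names are assumptions made for illustration; a production pipeline would more likely use a wavelet library, and the Morlet transform used for water level is not shown.

```python
import math
from typing import List, Sequence


def mexican_hat(t: float, scale: float) -> float:
    """Mexican Hat (Ricker) wavelet evaluated at offset t for a given scale."""
    x = t / scale
    norm = 2.0 / (math.sqrt(3.0 * scale) * math.pi ** 0.25)
    return norm * (1.0 - x * x) * math.exp(-0.5 * x * x)


def cwt(signal: Sequence[float], scales: Sequence[float]) -> List[List[float]]:
    """Return a scalogram: one row per scale, one column per time step."""
    n = len(signal)
    rows = []
    for scale in scales:
        half = int(math.ceil(5 * scale))  # wavelet is negligible beyond ~5 scales
        offsets = range(-half, half + 1)
        kernel = [mexican_hat(k, scale) for k in offsets]
        row = []
        for t in range(n):
            acc = 0.0
            for k, w in zip(offsets, kernel):
                idx = t + k
                if 0 <= idx < n:  # assumed zero padding at the series edges
                    acc += signal[idx] * w
            row.append(acc)
        rows.append(row)
    return rows
```

Applied to a single isolated spike, every row of the scalogram peaks at the spike's position, and the response is strongest at the smallest scale. That is the kind of high-frequency signature a short, intense rainfall burst would leave.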

Contributions

Introduction of WL–MoE, a weak-label-guided, regime-conditional mixture-of-experts framework for flash-flood water-level forecasting, featuring a learned soft-gating function and a two-stage training design to stabilize expert specialization under severe regime imbalance.
Proposal of a two-stage training scheme that counterbalances severe class skew, ensuring minority flood regimes are represented by dedicated experts and improving peak-level accuracy.
Unification of pattern-level expert-usage profiling with instance-level Grad-CAM saliency to provide global and local explanations that respect temporal order and align with domain expectations.
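As a reminder of how a Grad-CAM saliency map is formed, the sketch below applies the standard combination rule: each feature-map channel is weighted by its spatially averaged gradient, the weighted maps are summed, and negative values are clipped. The activations and gradients are passed in as plain nested lists; in practice they would be captured from a convolutional layer with an autograd framework. Treating rows as scales and columns as time steps is an assumption, and `grad_cam` is an illustrative function, not the authors' implementation.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Combine feature maps into a saliency map using the Grad-CAM rule.

    activations[k] and gradients[k] are the k-th channel's feature map and the
    gradient of the target output with respect to that map (same shape).
    """
    rows, cols = len(activations[0]), len(activations[0][0])
    # Channel weight = gradient averaged over every position of that channel.
    weights = [sum(sum(r) for r in g) / (rows * cols) for g in gradients]
    cam = [[0.0] * cols for _ in range(rows)]
    for w, act in zip(weights, activations):
        for i in range(rows):
            for j in range(cols):
                cam[i][j] += w * act[i][j]
    # ReLU keeps only positions that increase the target output.
    return [[max(0.0, v) for v in row] for row in cam]
```

Because the map keeps the layout of the convolutional feature maps, a highlighted cell can be read back as "this scale, at this time", which is how instance-level explanations of this kind can respect temporal order.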

Funding

This project was funded by DEFRA (Department for Environment, Food and Rural Affairs) as part of the £200 million Flood and Coastal Innovation Programmes, managed by the Environment Agency.

Citation

@article{Wang2026Flash,
  author = {Wang, Jessica and Sanderson, J.E. and Woo, Wai Lok},
  title = {Flash flood forecasting in North East England through weak label-guided mixture of experts with multi-scale explainability},
  journal = {Journal of Hydrology: Regional Studies},
  year = {2026},
  doi = {10.1016/j.ejrh.2026.103402},
  url = {https://doi.org/10.1016/j.ejrh.2026.103402}
}

Original Source: https://doi.org/10.1016/j.ejrh.2026.103402