Rehbein (2026) Reconstructing nineteenth-century Danube river water levels with transformer-based computer vision
Identification
- Journal: Earth System Science Data
- Year: 2026
- Date: 2026-03-10
- Authors: Malte Rehbein
- DOI: 10.5194/essd-18-1783-2026
Research Groups
- Chair of Computational Humanities, University of Passau, Passau, Germany
- Max Planck Institute of Geoanthropology, Jena, Germany
Short Summary
This study developed a semi-automated workflow using transformer-based computer vision to convert nineteenth-century hand-drawn Bavarian Danube gauge charts into daily water-level series. The method achieved high accuracy (mean composite score 0.979) across three representative gauges while reducing manual effort by an order of magnitude, providing openly available, transparently documented historical hydrological data.
Objective
- To convert nineteenth-century Bavarian Danube water-level charts (1826–1894) into daily gauge-level series, expressed in physical units (millimetres) and referenced to the respective gauge zero, using a novel semi-automated workflow.
Study Configuration
- Spatial Scale: Bavarian Danube river, focusing on three representative gauges: Neu-Ulm, Vilshofen, and Passau.
- Temporal Scale: 1826–1894, generating daily water-level series.
Methodology and Data
- Models used: Transformer-based computer vision, specifically LineFormer (a deep learning architecture for line extraction). The workflow (HWLR) combines light, grid-aware pre-processing, optional dewarping, transformer-based line extraction, pixel-to-curve calibration, and targeted human checks.
- Data sources: Archival nineteenth-century Bavarian Danube gauge charts (hand-drawn annual hydrographs on pre-printed forms) from the Bayerisches Hauptstaatsarchiv (BayHStA).
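The pixel-to-curve calibration step can be illustrated with a minimal sketch. It assumes that both chart axes are linear after dewarping and that a few printed grid lines with known day numbers and gauge readings are available as reference points. The function names (`calibrate_axis`, `curve_to_daily_levels`) and all parameters are hypothetical and are not taken from the HWLR code.

```python
import numpy as np

def calibrate_axis(pixel_refs, value_refs):
    """Least-squares fit of value = a * pixel + b from reference grid lines."""
    a, b = np.polyfit(np.asarray(pixel_refs, dtype=float),
                      np.asarray(value_refs, dtype=float), 1)
    return a, b

def curve_to_daily_levels(xs, ys, x_refs, day_refs, y_refs, level_refs_mm, n_days):
    """Map extracted curve pixels to one water level (mm) per day.

    Assumption: linear axes. Image y grows downward, which the fitted
    (negative) slope of the vertical axis handles automatically.
    """
    ax, bx = calibrate_axis(x_refs, day_refs)
    ay, by = calibrate_axis(y_refs, level_refs_mm)
    days = ax * np.asarray(xs, dtype=float) + bx
    levels_mm = ay * np.asarray(ys, dtype=float) + by
    # np.interp needs increasing sample positions, so sort by day first.
    order = np.argsort(days)
    target_days = np.arange(1, n_days + 1)
    return np.interp(target_days, days[order], levels_mm[order])

# Illustrative values only: three curve points on a 365-day chart.
daily = curve_to_daily_levels(
    xs=[0, 182, 364], ys=[500, 300, 100],
    x_refs=[0, 364], day_refs=[1, 365],
    y_refs=[500, 100], level_refs_mm=[0, 4000],
    n_days=365,
)
```

Fitting the axes by least squares rather than from exactly two points lets additional grid lines be used as references when a scan is slightly distorted. Whether the published workflow does this is not stated here.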
Main Results
- The semi-automated pipeline achieved a high series-level accuracy with a mean composite score of 0.979 across three representative gauges (Neu-Ulm, Vilshofen, Passau).
- Manual digitisation effort was reduced by factors of 6.10 to 9.62, roughly an order of magnitude, compared with fully manual approaches.
- The workflow produces versioned datasets with page-level provenance, confidence scores, and methodological descriptors, ensuring transparency and reusability.
- A validation set of five unseen gauge years yielded a mean accuracy of 0.954 on the custom peak-aware score.
Contributions
- Development of a pragmatic, semi-automated workflow (HWLR) for historical hydrometric data rescue, combining pre-processing, transformer-based line extraction (LineFormer), and pixel-to-curve calibration with human checks.
- Creation of a curated ground-truth sample and an evaluation protocol that includes standard error measures (RMSE, MAE), Pearson's r, and a novel peak-aware composite score.
- A case study demonstrating high series-level fidelity and substantial reduction in manual digitisation effort for nineteenth-century Bavarian Danube gauge charts.
- Open release of the reconstructed, versioned dataset with month-level provenance, uncertainty flags, and code pointers to facilitate verification and reuse.
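The standard measures named in the evaluation protocol (RMSE, MAE, Pearson's r) can be computed as in the following sketch, which compares a reconstructed daily series with a manually digitised reference. The function name `series_metrics` and the sample values are hypothetical. The peak-aware composite score is not reproduced because its definition is not given in this summary.

```python
import numpy as np

def series_metrics(reference_mm, reconstructed_mm):
    """RMSE, MAE (both in mm) and Pearson's r between two daily series."""
    ref = np.asarray(reference_mm, dtype=float)
    rec = np.asarray(reconstructed_mm, dtype=float)
    if ref.shape != rec.shape:
        raise ValueError("series must have the same length")
    # Ignore days where either series has no value.
    valid = ~(np.isnan(ref) | np.isnan(rec))
    err = rec[valid] - ref[valid]
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "pearson_r": float(np.corrcoef(ref[valid], rec[valid])[0, 1]),
    }

# Illustrative values only, not data from the paper.
metrics = series_metrics([100, 200, 300, 400], [110, 190, 310, 390])
```

RMSE weights large deviations more heavily than MAE, so reporting both gives a rough indication of whether errors are spread evenly or concentrated in a few days such as flood peaks.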
Funding
- The article processing charges for this open-access publication were covered by the Max Planck Society.
Citation
@article{Rehbein2026Reconstructing,
author = {Rehbein, Malte},
title = {Reconstructing nineteenth-century Danube river water levels with transformer-based computer vision},
journal = {Earth System Science Data},
year = {2026},
doi = {10.5194/essd-18-1783-2026},
url = {https://doi.org/10.5194/essd-18-1783-2026}
}
Original Source: https://doi.org/10.5194/essd-18-1783-2026