Liu et al. (2025) An improved global river vector dataset based on multi-source river data fusion
Identification
- Journal: Scientific Data
- Year: 2025
- Date: 2025-12-12
- Authors: Yinxue Liu, Jianhua Wang, Changjun Liu, Yaohuan Huang, Yuanyuan Liu, Jie Liu, Fuxin Chai, Sheng Chen, Min Li, Wei Qu
- DOI: 10.1038/s41597-025-06399-2
Research Groups
- State Key Laboratory of Water Cycle and Water Security, China Institute of Water Resources and Hydropower Research, Beijing, China
- State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
- College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, China
- China Key Laboratory of River Basin Digital Twinning of Ministry of Water Resources, Beijing, China
Short Summary
This study introduces GSriver, a new global river vector dataset generated through a multi-source data fusion framework that improves spatial accuracy while preserving complete topological information. Validated against NHDPlus, GSriver improves positional accuracy over MERIT, GRIT, and HydroRIVERS by 36.3%, 40.7%, and 56.7%, respectively.
Objective
- To generate a high-spatial-accuracy global river dataset with complete topological information (GSriver) by fusing high-spatial-resolution OpenStreetMap (OSM) waterways with HydroRIVERS and supplementing missing segments with the Global River Topology (GRIT) dataset.
Study Configuration
- Spatial Scale: Global, with validation against national-scale high-precision datasets (e.g., NHDPlus for the United States).
- Temporal Scale: The dataset is a static product based on the 2022 version of OSM Waterways and continuously updated HydroRIVERS, representing a snapshot of river networks.
Methodology and Data
- Models used: A multi-source vector river data fusion framework comprising HydroRIVERS reach integration, OSM Waterways identification (angular deviation, average distance filtering, overlap sifting), multi-data fusion (vertex-pair establishment, coordinate replacement, geometry refinement, isolated-vertex filtering), GRIT dataset supplementation, and topology repair.
- Data sources:
  - Foundational: HydroRIVERS (8.47 million reaches, topologically complete).
  - Primary Fusion: OpenStreetMap (OSM) Waterways (2022 version, 27.22 million rivers, high spatial precision in developed areas).
  - Secondary Fusion/Supplement: Global River Topology (GRIT) (1.76 million reaches, for areas lacking OSM coverage).
  - Validation Benchmark: NHDPlus (V2.0) for the United States (1:24,000 scale, 24.43 million reaches).
  - Comparison: MERIT Hydro, GRIT, HydroRIVERS.
  - Visual Inspection: Google Earth satellite imagery (approximately 1 meter resolution).
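The OSM Waterways identification step names two geometric criteria, angular deviation and average distance. The sketch below shows one plausible way such a filter could work; it is not the authors' implementation. It assumes planar coordinates in meters, and every function name and both thresholds (`max_angle_deg`, `max_avg_dist_m`) are hypothetical, since this summary does not report the values used in the paper.

```python
import math

def bearing(line):
    """Overall direction of a polyline (first to last vertex), in degrees [0, 180).
    Taking the angle modulo 180 ignores digitizing direction."""
    (x0, y0), (x1, y1) = line[0], line[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0

def angular_deviation(a, b):
    """Smallest angle between the overall directions of two polylines, in [0, 90]."""
    d = abs(bearing(a) - bearing(b))
    return min(d, 180.0 - d)

def point_segment_distance(p, s0, s1):
    """Euclidean distance from point p to the segment s0-s1."""
    (px, py), (x0, y0), (x1, y1) = p, s0, s1
    dx, dy = x1 - x0, y1 - y0
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:  # degenerate segment
        return math.hypot(px - x0, py - y0)
    # Clamp the projection parameter so the foot point stays on the segment.
    t = max(0.0, min(1.0, ((px - x0) * dx + (py - y0) * dy) / seg_len2))
    return math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))

def average_distance(candidate, reference):
    """Mean distance from each candidate vertex to the reference polyline."""
    total = 0.0
    for p in candidate:
        total += min(
            point_segment_distance(p, reference[i], reference[i + 1])
            for i in range(len(reference) - 1)
        )
    return total / len(candidate)

def is_match(candidate, reference, max_angle_deg=30.0, max_avg_dist_m=500.0):
    """Hypothetical filter: accept a candidate waterway only if it is roughly
    parallel to the reference reach and close to it on average."""
    return (
        angular_deviation(candidate, reference) <= max_angle_deg
        and average_distance(candidate, reference) <= max_avg_dist_m
    )
```

Under these assumptions, a candidate running alongside a HydroRIVERS reach passes both tests, while a nearby tributary crossing it at a right angle fails the angular test even though it is close. Overlap sifting, vertex pairing, and topology repair are not covered here.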
Main Results
- GSriver significantly improves spatial accuracy over MERIT, GRIT, and HydroRIVERS by 36.3%, 40.7%, and 56.7%, respectively, when validated against NHDPlus.
- More than 40% of GSriver vertices deviate less than 10 meters from NHDPlus, and 69.3% fall within 50 meters.
- 80.1% of GSriver rivers are located within 100 meters of NHDPlus, compared to 72.1% (MERIT), 74.5% (GRIT), and 11.3% (HydroRIVERS).
- The final GSriver dataset comprises 748,629 rivers, with 382,912,493 vertices. Of these, 57.3% originate from OSM waterways, 27.8% from GRIT, and 14.9% from the original HydroRIVERS.
- The dataset includes new attributes such as river length, fusion ratio, river name (from OSM), and an improved hierarchical river classification (8 levels).
- The dataset is publicly available at https://doi.org/10.6084/m9.figshare.30119851.v3.
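The distance-threshold figures above (shares within 10, 50, and 100 meters of NHDPlus) are cumulative fractions of deviations. As a minimal sketch, assuming the per-vertex or per-river deviations in meters have already been computed, such shares could be tabulated as follows. The function name `share_within` is hypothetical, and treating each threshold as inclusive is an assumption; the paper's exact convention is not stated in this summary.

```python
from bisect import bisect_right

def share_within(deviations_m, thresholds_m):
    """Return {threshold: fraction of deviations <= threshold}.

    deviations_m: distances (meters) from each evaluated feature to the benchmark.
    thresholds_m: distance thresholds (meters) to report.
    """
    ordered = sorted(deviations_m)
    n = len(ordered)
    if n == 0:
        raise ValueError("no deviations supplied")
    # bisect_right counts the values that are <= t in the sorted list.
    return {t: bisect_right(ordered, t) / n for t in thresholds_m}

# Illustrative, made-up deviations; these are not values from the paper.
example = [2, 8, 15, 40, 60, 120, 300, 9, 45, 95]
shares = share_within(example, [10, 50, 100])
```

Sorting once and using binary search keeps the cost at O(n log n) regardless of how many thresholds are reported, which matters when the input holds hundreds of millions of vertices.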
Contributions
- Proposes a novel multi-source vector data fusion framework that systematically reconciles river datasets of disparate resolutions into a unified, high-precision global hydrographic product.
- Generates GSriver, a global river dataset that significantly enhances spatial accuracy by leveraging crowdsourced data (OSM) while preserving the topological completeness of DEM-derived datasets (HydroRIVERS and GRIT).
- Addresses the spatial accuracy limitations of conventional DEM-derived river datasets, particularly in flat terrains, by integrating high-resolution geometries.
- Offers a scalable and cost-effective solution for constructing large-scale river datasets, with a modular framework allowing for continuous integration of higher-resolution data.
- Provides a new river classification approach that better aligns with conventional usage and map representation, improving interpretability and standardization.
Funding
- National Natural Science Foundation of China (Grant no. 52394235)
- Strategic Priority Research Program of the Chinese Academy of Sciences (Grant no. XDB0740202)
- Science and Technology Fundamental Resources Investigation Program (Grant no. 2023FY101000)
- Key Science and Technology Program Projects of the Ministry of Emergency Management (Grant no. 2025EMST020101)
Citation
@article{Liu2025improved,
author = {Liu, Yinxue and Wang, Jianhua and Liu, Changjun and Huang, Yaohuan and Liu, Yuanyuan and Liu, Jie and Chai, Fuxin and Chen, Sheng and Li, Min and Qu, Wei},
title = {An improved global river vector dataset based on multi-source river data fusion},
journal = {Scientific Data},
year = {2025},
doi = {10.1038/s41597-025-06399-2},
url = {https://doi.org/10.1038/s41597-025-06399-2}
}
Original Source: https://doi.org/10.1038/s41597-025-06399-2