Feng et al. (2025) A novel deep learning approach for high-precision rainfall intensity inversion using urban surveillance audio

Identification

Journal: Advances in Space Research
Year: 2025
Date: 2025-10-24
Authors: Jiangfan Feng, Xi Fu, Shaokang Dong
DOI: 10.1016/j.asr.2025.10.070

Research Groups

School of Artificial Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
Key Laboratory of Tourism Multisource Data Perception and Decision, Ministry of Culture and Tourism (TMDPD, MCT), Chongqing University of Posts and Telecommunications, Chongqing, China
School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China

Short Summary

This paper introduces MS-TF RainNet, a deep learning framework for high-precision rainfall intensity inversion from urban surveillance audio. Under denoised conditions it achieves an RMSE of 0.7708 mm/h, a 14.94% improvement in RMSE over a Transformer-based baseline.

Objective

To develop a novel deep learning framework (MS-TF RainNet) that enables high-precision rainfall intensity inversion from urban surveillance audio, addressing limitations in current audio-based approaches regarding multi-scale pattern capture, time–frequency dependency modeling, and effective regression constraints.

Study Configuration

Spatial Scale: Urban environments, focusing on leveraging existing urban surveillance infrastructure.
Temporal Scale: Continuous, fine-grained monitoring of rainfall intensity.

Methodology and Data

Models used: MS-TF RainNet, a deep learning framework comprising:
- A hierarchical multi-scale feature extraction module that processes Mel-frequency cepstral coefficients (MFCCs) through parallel convolutional branches.
- A dual-domain attention mechanism combining temporal and frequency attention.
- Deep supervision with auxiliary regression heads.
Data sources: Surveillance Audio Rainfall Intensity Dataset (SARID).
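
The first two components listed above (parallel convolutional branches over MFCCs, followed by temporal and frequency attention) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the kernel sizes (3, 5, 7), the fixed averaging kernels standing in for learned weights, the mean fusion of branches, and the energy-based attention scores are all assumptions chosen to keep the example runnable without a deep learning library. The function names are hypothetical.

```python
import math

def conv1d_same(seq, kernel):
    """1-D convolution along time, zero-padded so output length equals input length."""
    half = len(kernel) // 2
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t + k - half
            if 0 <= idx < len(seq):
                acc += w * seq[idx]
        out.append(acc)
    return out

def multi_scale_branches(mfcc, kernel_sizes=(3, 5, 7)):
    """Run parallel branches with different receptive fields over each MFCC row.

    mfcc is a list of rows shaped [n_coeffs][n_frames].
    Kernel sizes and averaging weights are assumptions, not values from the paper.
    """
    branches = []
    for size in kernel_sizes:
        kernel = [1.0 / size] * size  # placeholder for learned weights
        branches.append([conv1d_same(row, kernel) for row in mfcc])
    return branches

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dual_domain_attention(fmap):
    """Reweight a [n_coeffs][n_frames] map along both the time and frequency axes.

    Scores here are mean energies (an assumption); a trained model would learn them.
    """
    n_freq, n_time = len(fmap), len(fmap[0])
    time_scores = [sum(fmap[f][t] ** 2 for f in range(n_freq)) / n_freq
                   for t in range(n_time)]
    freq_scores = [sum(v * v for v in fmap[f]) / n_time for f in range(n_freq)]
    time_w = softmax(time_scores)
    freq_w = softmax(freq_scores)
    weighted = [[fmap[f][t] * time_w[t] * freq_w[f] for t in range(n_time)]
                for f in range(n_freq)]
    return weighted, time_w, freq_w

# Toy input: 2 coefficients x 6 frames (illustrative values, not SARID data).
mfcc = [[0.0, 1.0, 2.0, 1.0, 0.0, 1.0],
        [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]
branches = multi_scale_branches(mfcc)
# Fuse branches by averaging (an assumption; concatenation is another common choice).
fused = [[sum(b[f][t] for b in branches) / len(branches) for t in range(6)]
         for f in range(2)]
weighted, time_w, freq_w = dual_domain_attention(fused)
```

Each branch sees the same input at a different temporal scale, and the two attention weight vectors each sum to 1, so the output is the fused map rescaled per frame and per coefficient.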

Main Results

MS-TF RainNet achieved a Root Mean Square Error (RMSE) of 0.7708 mm/h and an R² of 0.8196 under denoised conditions.
Relative to a Transformer-based baseline model, it improved RMSE by 14.94% and R² by 9.18%.
In noisy environments, the model maintained robustness with an RMSE of 0.8443 mm/h and an R² of 0.7983.
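
For readers less familiar with the two metrics, the sketch below shows the standard definitions of RMSE and R² and applies a simple relative-change calculation to the denoised and noisy figures reported above. The toy observations and the function names are illustrative assumptions; only the four constants come from this summary.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_change(reference, value):
    """Fractional change of value with respect to reference."""
    return (value - reference) / reference

# Toy rainfall intensities in mm/h (illustrative only, not data from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
toy_rmse = rmse(observed, predicted)    # sqrt(4 / 4) = 1.0
toy_r2 = r2_score(observed, predicted)  # 1 - 4 / 5 = 0.2

# Figures reported in this summary.
RMSE_DENOISED, R2_DENOISED = 0.7708, 0.8196
RMSE_NOISY, R2_NOISY = 0.8443, 0.7983

rmse_increase = relative_change(RMSE_DENOISED, RMSE_NOISY)  # roughly 0.095
r2_drop = R2_DENOISED - R2_NOISY                            # roughly 0.021
```

By this arithmetic, moving from denoised to noisy audio raises the reported RMSE by about 9.5% and lowers R² by about 0.021.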

Contributions

Proposes MS-TF RainNet, a novel deep learning framework for high-precision rainfall intensity inversion from urban surveillance audio.
Introduces a hierarchical multi-scale feature extraction module to capture both local and global rainfall patterns.
Incorporates a dual-domain attention mechanism for transient noise suppression and spectral feature amplification.
Utilizes deep supervision with auxiliary regression heads to enforce hierarchical feature consistency and mitigate gradient vanishing.
Offers a cost-effective solution for urban hydrometeorology that reuses existing surveillance infrastructure and is reported to outperform conventional methods.
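
The deep-supervision contribution listed above can be sketched as a combined training loss: the main regression head's loss plus a weighted sum of the auxiliary heads' losses, so that intermediate layers also receive a direct gradient signal. This summary does not give the paper's loss formulation, so the use of mean squared error, the auxiliary weight of 0.3, and the function names are assumptions for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def deep_supervision_loss(y_true, main_pred, aux_preds, aux_weight=0.3):
    """Main-head loss plus a weighted sum of auxiliary-head losses.

    aux_preds holds one prediction list per auxiliary regression head.
    aux_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    main = mse(y_true, main_pred)
    aux = sum(mse(y_true, pred) for pred in aux_preds)
    return main + aux_weight * aux

# Toy example: a perfect main head and two imperfect auxiliary heads.
targets = [1.0, 2.0]
main_head = [1.0, 2.0]
aux_heads = [[2.0, 3.0], [1.0, 4.0]]  # per-head MSE: 1.0 and 2.0
loss = deep_supervision_loss(targets, main_head, aux_heads)  # 0 + 0.3 * 3.0
```

Even when the main head is already accurate, the auxiliary terms keep the total loss non-zero, which is how earlier layers continue to be pushed toward features that predict rainfall intensity.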

Funding

Not explicitly mentioned in the provided text.

Citation

@article{Feng2025novel,
  author = {Feng, Jiangfan and Fu, Xi and Dong, Shaokang},
  title = {A novel deep learning approach for high-precision rainfall intensity inversion using urban surveillance audio},
  journal = {Advances in Space Research},
  year = {2025},
  doi = {10.1016/j.asr.2025.10.070},
  url = {https://doi.org/10.1016/j.asr.2025.10.070}
}

Original Source: https://doi.org/10.1016/j.asr.2025.10.070