Wang et al. (2021) A machine learning framework to improve effluent quality control in wastewater treatment plants
Identification
- Journal: The Science of The Total Environment
- Year: 2021
- Date: 2021-04-16
- Authors: Dong Wang, Sven Thunéll, Ulrika Lindberg, Lili Jiang, Johan Trygg, Mats Tysklind, Nabil Souihi
- DOI: 10.1016/j.scitotenv.2021.147138
Research Groups
- Department of Chemistry, Umeå University, Sweden
- Vakin, Umeå, Sweden
- Department of Computing Science, Umeå University, Sweden
Short Summary
This study developed a novel machine learning framework to improve effluent quality control in wastewater treatment plants by clarifying the complex relationships between operational variables and effluent parameters, rigorously accounting for time lags between processes. The framework, applied to the Umeå WWTP, revealed key operational factors influencing Total Suspended Solids and Phosphate in effluent, providing actionable insights for advanced process control.
Objective
- To develop a novel machine learning-based framework to improve effluent quality control in wastewater treatment plants (WWTPs) by clarifying the relationships between operational variables and effluent parameters.
- To investigate how operational factors affect effluent quality, specifically Total Suspended Solids in effluent (TSSe) and Phosphate in effluent (PO4e), while rigorously handling time lags between process steps.
Study Configuration
- Spatial Scale: Full-scale Umeå Wastewater Treatment Plant (WWTP) in Umeå, Sweden.
- Temporal Scale: 10-minute intervals (averaged from higher-frequency online-meter readings), 105,763 samples in total.
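The compression to 10-minute intervals can be pictured as a simple binned average. The following is a minimal sketch under stated assumptions, not the authors' preprocessing code: the 1-second sampling rate, the dates, and the helper name `to_ten_minute_means` are hypothetical, and only the column name `TSSe` comes from the study.

```python
import numpy as np
import pandas as pd


def to_ten_minute_means(raw: pd.DataFrame) -> pd.DataFrame:
    """Average a datetime-indexed frame of meter readings into 10-minute bins."""
    return raw.resample("10min").mean()


# Hypothetical input: one hour of 1-second readings from a single meter.
index = pd.date_range("2020-01-01 00:00:00", periods=3600, freq=pd.Timedelta(seconds=1))
raw = pd.DataFrame({"TSSe": np.arange(3600, dtype=float)}, index=index)

compressed = to_ten_minute_means(raw)
print(len(compressed))             # 6 bins for one hour of data
print(compressed["TSSe"].iloc[0])  # 299.5, the mean of readings 0..599
```

Averaging rather than subsampling keeps every raw reading's contribution to each bin, which tends to suppress meter noise at the cost of smoothing short spikes.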
Methodology and Data
- Models used:
  - Random Forest (RF) models for primary analysis and interpretation.
  - Deep Neural Network (DNN) models for validating RF model performance.
  - Variable Importance Measure (VIM) analysis (Mean Decrease Impurity) to identify influential variables.
  - Partial Dependence Plot (PDP) analysis to elucidate specific effects of important variables.
- Data sources: Online meters installed at various points within the Umeå WWTP, providing high-resolution data (initially 1 ms per sample, averaged to 10-minute intervals). The dataset comprised 105,763 samples with 34 variables, including 2 effluent parameters (TSSe, PO4e) and 32 operational variables. A novel approach was used to account for time lags between process variables by transforming time-series data into batch-series data.
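To illustrate why time lags matter, the sketch below pairs each upstream reading with the effluent value recorded a fixed number of rows later. It is a simplified stand-in: the study's time-series to batch-series transformation is not reproduced here, and the helper `align_with_lags`, the lag of 2 rows, and the data values are all hypothetical.

```python
import pandas as pd


def align_with_lags(df: pd.DataFrame, lags: dict, target: str) -> pd.DataFrame:
    """Shift each upstream column forward by its lag (in rows) so that every
    row pairs an upstream reading with the effluent value it could have
    influenced. Rows without a complete pairing are dropped."""
    aligned = pd.DataFrame(index=df.index)
    for column, lag in lags.items():
        aligned[column] = df[column].shift(lag)
    aligned[target] = df[target]
    return aligned.dropna()


# Hypothetical data: six consecutive 10-minute samples.
df = pd.DataFrame({
    "TTin": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "TSSe": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})

# Assume (for illustration only) that TTin affects TSSe two samples later.
aligned = align_with_lags(df, {"TTin": 2}, target="TSSe")
print(aligned)
```

With a lag of 2, the first usable row pairs the first TTin reading (1.0) with the third TSSe value (30.0), and the two rows without a matching upstream reading are dropped.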
Main Results
- Both the Random Forest (RF) and Deep Neural Network (DNN) models achieved R² values above 0.86 on the test datasets for TSSe and PO4e, indicating that they reliably capture the relationships between operational variables and effluent parameters.
- Influent temperature (TTin) was identified as the most influential variable for both TSSe and PO4e, but its effect differs between the two. TSSe generally decreased as TTin rose across 6–16 °C. PO4e decreased as TTin rose from 6 to 10.2 °C, then fluctuated, and trended upward above 14.4 °C, likely due to microbial competition.
- PO4e is highly dependent on the Total Suspended Solids (TSS) concentration in aeration basins (TSSa2, TSSa3, TSSa4). Increases in TSS concentration generally promote PO4 removal, but excessive TSS (e.g., TSSa4 > 1600 mg/L) can have negative effects.
- The impact of TSS in aeration basins on TSSe and PO4e generally increases with the distance of the basin from the merging outlet (TSSa3 and TSSa4 are more influential than TSSa1 and TSSa2).
- Returning excessive sludge through the second return sludge pipe (FTsr > 28 m³/h) should be avoided because it impairs TSS removal, raising TSSe.
Contributions
- Presents a novel, interpretable machine learning framework for comprehensive understanding and control of WWTP processes, moving beyond mere prediction or soft sensor development.
- Introduces a rigorous method for handling time lags between process variables, crucial for accurate interpretation of cause-and-effect relationships in dynamic WWTP systems.
- Provides specific, actionable insights into the operational variables influencing effluent quality (TSSe and PO4e) at a full-scale WWTP, which can inform the development of advanced control strategies.
- Demonstrates the potential to increase control precision and reduce running costs in WWTPs by optimizing operational parameters based on data-driven insights.
- The developed framework is generalizable and applicable to other effluent parameters and industrial processes, provided sufficient high-resolution data are available.
Funding
- Green Technology and Environmental Economics (GreenTEE) initiative at Umeå University, Sweden.
- Green TEE platform.
Citation
@article{Wang2021machine,
author = {Wang, Dong and Thunéll, Sven and Lindberg, Ulrika and Jiang, Lili and Trygg, Johan and Tysklind, Mats and Souihi, Nabil},
title = {A machine learning framework to improve effluent quality control in wastewater treatment plants},
journal = {The Science of The Total Environment},
year = {2021},
doi = {10.1016/j.scitotenv.2021.147138},
url = {https://doi.org/10.1016/j.scitotenv.2021.147138}
}
Original Source: https://doi.org/10.1016/j.scitotenv.2021.147138