Wastewater-based epidemiology: deriving a SARS-CoV-2 data validation method to assess data quality and to improve trend recognition

Bibliographic Details
Main Authors: Cristina J. Saravia, Peter Pütz, Christian Wurzbacher, Anna Uchaikina, Jörg E. Drewes, Ulrike Braun, Claus Gerhard Bannick, Nathan Obermaier
Format: Article
Language: English
Published: Frontiers Media S.A. 2024-12-01
Series: Frontiers in Public Health
Online Access: https://www.frontiersin.org/articles/10.3389/fpubh.2024.1497100/full
collection DOAJ
description Introduction: Accurate and consistent data play a critical role in enabling health officials to make informed decisions regarding emerging trends in SARS-CoV-2 infections. Alongside traditional indicators such as the 7-day incidence rate, wastewater-based epidemiology can provide valuable insights into changes in SARS-CoV-2 concentration. However, wastewater composition and wastewater systems are complex. Several factors, such as precipitation events or industrial discharges, might affect the quantification of SARS-CoV-2 concentrations. Hence, analysing data from more than 150 wastewater treatment plants (WWTPs) in Germany necessitates an automated and reliable method to evaluate data validity, identify potential extreme events, and, if possible, improve overall data quality.
Methods: We developed a method that first categorises the data quality of WWTPs and corresponding laboratories based on the number of outliers in the reproduction rate as well as the number of implausible inflection points within the SARS-CoV-2 time series. Subsequently, we scrutinised statistical outliers in several standard quality control parameters (QCPs) that are routinely collected during the analysis process, such as the flow rate, the electrical conductivity, or surrogate viruses like the pepper mild mottle virus. Furthermore, we investigated outliers in the ratio of the analysed gene segments that might indicate laboratory errors. To evaluate the success of our method, we measured the degree of accordance between identified QCP outliers and outliers in the SARS-CoV-2 concentration curves.
Results and discussion: Our analysis reveals that the flow and gene segment ratios are typically best at identifying outliers in the SARS-CoV-2 concentration curve, albeit with variations across WWTPs and laboratories. The exclusion of datapoints based on QCP plausibility checks predominantly improves data quality. Our derived data quality categories are in good accordance with visual assessments.
Conclusion: Good data quality is crucial for trend recognition, both on the WWTP level and when aggregating data from several WWTPs to regional or national trends. Our model can help to improve data quality in the context of health-related monitoring and can be optimised for each individual WWTP to account for the large diversity among WWTPs.
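The abstract does not state which statistical rule defines an outlier or how the degree of accordance is computed, so the following is only an illustrative sketch and not the authors' method. It assumes a Tukey fence (1.5 × IQR) as the outlier rule and defines accordance as the share of concentration outliers that coincide with a QCP outlier. The function names `iqr_outliers` and `accordance` and all sample values are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Flag values outside the Tukey fences (assumed rule, not from the paper)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


def accordance(qcp_flags, target_flags):
    """Share of target outliers that are also flagged by the QCP.

    Returns None when the target series has no outliers at all.
    """
    total = sum(target_flags)
    if total == 0:
        return None
    hits = sum(1 for q, t in zip(qcp_flags, target_flags) if q and t)
    return hits / total


# Made-up sample series for one hypothetical WWTP (eight sampling days).
flow = [100, 102, 98, 101, 250, 99, 103, 100]        # flow rate, arbitrary units
gene_a = [5.0, 5.2, 4.9, 5.1, 1.0, 5.0, 5.3, 5.1]    # first gene segment signal
gene_b = [4.8, 5.0, 4.7, 5.0, 4.9, 4.9, 5.1, 5.0]    # second gene segment signal

# Ratio of the two gene segments; a sudden jump may hint at a laboratory error.
gene_ratio = [a / b for a, b in zip(gene_a, gene_b)]

conc_flags = iqr_outliers(gene_a)   # outliers in the concentration curve
flow_flags = iqr_outliers(flow)     # outliers in the flow QCP
ratio_flags = iqr_outliers(gene_ratio)

print("flow accordance:", accordance(flow_flags, conc_flags))
print("gene ratio accordance:", accordance(ratio_flags, conc_flags))
```

In this toy data the fifth sampling day combines a flow spike with a collapsed gene segment signal, so both QCPs flag the same day as the concentration outlier. With real data, the fence width `k` would likely need tuning per WWTP, in line with the abstract's remark that the model can be optimised for each plant.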
id doaj-art-05faa56bcca9401caeb3d8639440b844
institution Kabale University
issn 2296-2565
spelling doaj-art-05faa56bcca9401caeb3d8639440b844 2024-12-12T13:47:17Z
volume 12
doi 10.3389/fpubh.2024.1497100
article number 1497100
affiliations Cristina J. Saravia: Wastewater Technology Research, Wastewater Disposal, German Environment Agency, Berlin, Germany
Peter Pütz: Infectious Disease Epidemiology, Surveillance, Robert-Koch-Institute, Berlin, Germany
Christian Wurzbacher: Chair of Urban Water Systems Engineering, Technical University of Munich, Garching, Germany
Anna Uchaikina: Chair of Urban Water Systems Engineering, Technical University of Munich, Garching, Germany
Jörg E. Drewes: Chair of Urban Water Systems Engineering, Technical University of Munich, Garching, Germany
Ulrike Braun: Wastewater Analysis, Monitoring Methods, German Environment Agency, Berlin, Germany
Claus Gerhard Bannick: Wastewater Technology Research, Wastewater Disposal, German Environment Agency, Berlin, Germany
Nathan Obermaier: Wastewater Technology Research, Wastewater Disposal, German Environment Agency, Berlin, Germany
topic SARS-CoV-2
data plausibility
automated quality control
wastewater-based epidemiology
wastewater treatment plant classification
outlier detection