Wastewater-based epidemiology: deriving a SARS-CoV-2 data validation method to assess data quality and to improve trend recognition
Introduction: Accurate and consistent data play a critical role in enabling health officials to make informed decisions regarding emerging trends in SARS-CoV-2 infections. Alongside traditional indicators such as the 7-day-incidence rate, wastewater-based epidemiology can provide valuable insights int...
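| Main Authors: | Cristina J. Saravia, Peter Pütz, Christian Wurzbacher, Anna Uchaikina, Jörg E. Drewes, Ulrike Braun, Claus Gerhard Bannick, Nathan Obermaier |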
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2024-12-01 |
| Series: | Frontiers in Public Health |
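| Subjects: | SARS-CoV-2; data plausibility; automated quality control; wastewater-based epidemiology; wastewater treatment plant classification; outlier detection |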
| Online Access: | https://www.frontiersin.org/articles/10.3389/fpubh.2024.1497100/full |
| _version_ | 1846126620686942208 |
|---|---|
| author | Cristina J. Saravia; Peter Pütz; Christian Wurzbacher; Anna Uchaikina; Jörg E. Drewes; Ulrike Braun; Claus Gerhard Bannick; Nathan Obermaier |
| author_sort | Cristina J. Saravia |
| collection | DOAJ |
| description | Introduction: Accurate and consistent data play a critical role in enabling health officials to make informed decisions regarding emerging trends in SARS-CoV-2 infections. Alongside traditional indicators such as the 7-day-incidence rate, wastewater-based epidemiology can provide valuable insights into SARS-CoV-2 concentration changes. However, wastewater composition and wastewater systems are rather complex. Multiple effects such as precipitation events or industrial discharges might affect the quantification of SARS-CoV-2 concentrations. Hence, analysing data from more than 150 wastewater treatment plants (WWTPs) in Germany necessitates an automated and reliable method to evaluate data validity, identify potential extreme events, and, if possible, improve overall data quality. Methods: We developed a method that first categorises the data quality of WWTPs and corresponding laboratories based on the number of outliers in the reproduction rate as well as the number of implausible inflection points within the SARS-CoV-2 time series. Subsequently, we scrutinised statistical outliers in several standard quality control parameters (QCPs) that are routinely collected during the analysis process, such as the flow rate, the electrical conductivity, or surrogate viruses like the pepper mild mottle virus. Furthermore, we investigated outliers in the ratio of the analysed gene segments that might indicate laboratory errors. To evaluate the success of our method, we measured the degree of accordance between identified QCP outliers and outliers in the SARS-CoV-2 concentration curves. Results and discussion: Our analysis reveals that the flow and gene segment ratios are typically best at identifying outliers in the SARS-CoV-2 concentration curve, albeit with variations across WWTPs and laboratories. The exclusion of data points based on QCP plausibility checks predominantly improves data quality. Our derived data quality categories are in good accordance with visual assessments. Conclusion: Good data quality is crucial for trend recognition, both on the WWTP level and when aggregating data from several WWTPs to regional or national trends. Our model can help to improve data quality in the context of health-related monitoring and can be optimised for each individual WWTP to account for the large diversity among WWTPs. |
| format | Article |
| id | doaj-art-05faa56bcca9401caeb3d8639440b844 |
| institution | Kabale University |
| issn | 2296-2565 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Public Health |
| spelling | doaj-art-05faa56bcca9401caeb3d8639440b844; 2024-12-12T13:47:17Z; eng; Frontiers Media S.A.; Frontiers in Public Health; ISSN 2296-2565; 2024-12-01; volume 12; DOI 10.3389/fpubh.2024.1497100; article 1497100; Wastewater-based epidemiology: deriving a SARS-CoV-2 data validation method to assess data quality and to improve trend recognition; authors: Cristina J. Saravia (0), Peter Pütz (1), Christian Wurzbacher (2), Anna Uchaikina (3), Jörg E. Drewes (4), Ulrike Braun (5), Claus Gerhard Bannick (6), Nathan Obermaier (7); affiliations: (0) Wastewater Technology Research, Wastewater Disposal, German Environment Agency, Berlin, Germany; (1) Infectious Disease Epidemiology, Surveillance, Robert-Koch-Institute, Berlin, Germany; (2) Chair of Urban Water Systems Engineering, Technical University of Munich, Garching, Germany; (3) Chair of Urban Water Systems Engineering, Technical University of Munich, Garching, Germany; (4) Chair of Urban Water Systems Engineering, Technical University of Munich, Garching, Germany; (5) Wastewater Analysis, Monitoring Methods, German Environment Agency, Berlin, Germany; (6) Wastewater Technology Research, Wastewater Disposal, German Environment Agency, Berlin, Germany; (7) Wastewater Technology Research, Wastewater Disposal, German Environment Agency, Berlin, Germany; https://www.frontiersin.org/articles/10.3389/fpubh.2024.1497100/full; keywords: SARS-CoV-2, data plausibility, automated quality control, wastewater-based epidemiology, wastewater treatment plant classification, outlier detection |
| title | Wastewater-based epidemiology: deriving a SARS-CoV-2 data validation method to assess data quality and to improve trend recognition |
| topic | SARS-CoV-2; data plausibility; automated quality control; wastewater-based epidemiology; wastewater treatment plant classification; outlier detection |
| url | https://www.frontiersin.org/articles/10.3389/fpubh.2024.1497100/full |
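
The description field above speaks of flagging statistical outliers in quality control parameters (QCPs) and measuring their accordance with outliers in the SARS-CoV-2 concentration curve. The record does not state which outlier rule or accordance measure the authors used, so the following is only a minimal sketch under assumptions: it uses a median/MAD modified z-score as the outlier rule and defines accordance as the share of concentration outliers that a QCP also flags. The function names, the threshold, and the sample values are hypothetical and are not taken from the article.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the set of indices whose modified z-score exceeds the threshold.

    Assumption: a median/MAD rule stands in for whatever outlier
    definition the article actually applies.
    """
    med = median(values)
    deviations = [abs(v - med) for v in values]
    mad = median(deviations)
    if mad == 0:
        # No spread around the median: nothing can be ranked as an outlier.
        return set()
    return {
        i for i, dev in enumerate(deviations)
        if 0.6745 * dev / mad > threshold
    }


def accordance(qcp_flags, target_flags):
    """Share of target outliers that the QCP also flags (hypothetical measure)."""
    if not target_flags:
        return None
    return len(qcp_flags & target_flags) / len(target_flags)


# Made-up sample series for one hypothetical WWTP (same sampling dates).
flow_rate = [100, 102, 98, 101, 250, 99, 103, 100]
concentration = [5.0, 5.2, 4.9, 5.1, 1.2, 5.0, 5.3, 20.0]

flow_flags = mad_outliers(flow_rate)
conc_flags = mad_outliers(concentration)
share = accordance(flow_flags, conc_flags)
print(sorted(flow_flags), sorted(conc_flags), share)
```

In this made-up series the flow spike coincides with one of the two concentration outliers, which would be consistent with a dilution event; the second concentration outlier has no matching flow anomaly and would need another QCP, such as the gene segment ratio, to explain it.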