Internal validation of self-reported case numbers in hospital quality reports: preparing secondary data for health services research

Bibliographic Details
Main Authors: Limei Ji, Max Geraedts, Werner de Cruppé
Format: Article
Language: English
Published: BMC 2024-12-01
Series: BMC Medical Research Methodology
Subjects:
Online Access: https://doi.org/10.1186/s12874-024-02429-6
collection DOAJ
description Abstract

Background: Health services research often relies on secondary data, necessitating quality checks for completeness, validity, and potential errors before use. Various methods address implausible data, including data elimination, statistical estimation, or value substitution from the same or another dataset. This study presents an internal validation process of a secondary dataset used to investigate hospital compliance with minimum caseload requirements (MCR) in Germany. The secondary data source validated is the German Hospital Quality Reports (GHQR), an official dataset containing structured self-reported data from all hospitals in Germany.

Methods: This study conducted an internal cross-field validation of MCR-related data in GHQR from 2016 to 2021. The validation process checked the validity of reported MCR caseloads, including data availability and consistency, by comparing the stated MCR caseload with further variables in the GHQR. Subsequently, implausible MCR caseload values were corrected using the most plausible values given in the same GHQR. The study also analysed the error sources and used reimbursement-related Diagnosis Related Groups Statistic data to assess the validation outcomes.

Results: The analysis focused on four MCR procedures. 11.8–27.7% of the total MCR caseload values in the GHQR appeared ambiguous, and 7.9–23.7% were corrected. The correction added 0.7–3.7% of cases not previously stated as MCR caseloads and added 1.5–26.1% of hospital sites as MCR performing hospitals not previously stated in the GHQR. The main error source was this non-reporting of MCR caseloads, especially by hospitals with low case numbers. The basic plausibility control implemented by the Federal Joint Committee since 2018 has improved the MCR-related data quality over time.

Conclusions: This study employed a comprehensive approach to dataset internal validation that encompassed: (1) hospital association level data, (2) hospital site level data and (3) medical department level data, (4) report data spanning six years, and (5) logical plausibility checks. To ensure data completeness, we selected the most plausible values without eliminating incomplete or implausible data. For future practice, we recommend a validation process when using GHQR as a data source for MCR-related research. Additionally, an adapted plausibility control could help to improve the quality of MCR documentation.
id doaj-art-00b4f312b93b451c8de6f37fe91eba9d
institution Kabale University
issn 1471-2288
affiliation Limei Ji, Max Geraedts, Werner de Cruppé: Institute for Health Services Research and Clinical Epidemiology, Philipps-Universität Marburg
citation BMC Medical Research Methodology, vol. 24, no. 1, pp. 1–16
topic German hospital quality report
Minimum case volume regulations
Internal data validation
Cross-field validation
Error source analysis
Secondary data