Accounting for bias due to outcome data missing not at random: comparison and illustration of two approaches to probabilistic bias analysis: a simulation study
Abstract Background Bias from data missing not at random (MNAR) is a persistent concern in health-related research. A bias analysis quantitatively assesses how conclusions change under different assumptions about missingness using bias parameters that govern the magnitude and direction of the bias....
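Main Authors: | Emily Kawabata, Daniel Major-Smith, Gemma L. Clayton, Chin Yang Shapland, Tim P. Morris, Alice R. Carter, Alba Fernández-Sanlés, Maria Carolina Borges, Kate Tilling, Gareth J. Griffith, Louise A. C. Millard, George Davey Smith, Deborah A. Lawlor, Rachael A. Hughes |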
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2024-11-01 |
Series: | BMC Medical Research Methodology |
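Subjects: | Bayesian bias analysis; Inverse probability weighting; Missing not at random; Monte Carlo bias analysis; Multiple imputation; Probabilistic bias analysis |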
Online Access: | https://doi.org/10.1186/s12874-024-02382-4 |
_version_ | 1846165143529979904 |
---|---|
author | Emily Kawabata; Daniel Major-Smith; Gemma L. Clayton; Chin Yang Shapland; Tim P. Morris; Alice R. Carter; Alba Fernández-Sanlés; Maria Carolina Borges; Kate Tilling; Gareth J. Griffith; Louise A. C. Millard; George Davey Smith; Deborah A. Lawlor; Rachael A. Hughes |
author_sort | Emily Kawabata |
collection | DOAJ |
description | Abstract. Background: Bias from data missing not at random (MNAR) is a persistent concern in health-related research. A bias analysis quantitatively assesses how conclusions change under different assumptions about missingness using bias parameters that govern the magnitude and direction of the bias. Probabilistic bias analysis specifies a prior distribution for these parameters, explicitly incorporating available information and uncertainty about their true values. A Bayesian bias analysis combines the prior distribution with the data’s likelihood function whilst a Monte Carlo bias analysis samples the bias parameters directly from the prior distribution. No study has compared a Monte Carlo bias analysis to a Bayesian bias analysis in the context of MNAR missingness. Methods: We illustrate an accessible probabilistic bias analysis using the Monte Carlo bias analysis approach and a well-known imputation method. We designed a simulation study based on a motivating example from the UK Biobank study, where a large proportion of the outcome was missing and missingness was suspected to be MNAR. We compared the performance of our Monte Carlo bias analysis to a principled Bayesian bias analysis, complete case analysis (CCA) and multiple imputation (MI) assuming missing at random. Results: As expected, given the simulation study design, CCA and MI estimates were substantially biased, with 95% confidence interval coverages of 7–48%. Including auxiliary variables (i.e., variables not included in the substantive analysis that are predictive of missingness and the missing data) in MI’s imputation model amplified the bias due to assuming missing at random. With reasonably accurate and precise information about the bias parameter, the Monte Carlo bias analysis performed as well as the Bayesian bias analysis. However, when very limited information was provided about the bias parameter, only the Bayesian bias analysis was able to eliminate most of the bias due to MNAR whilst the Monte Carlo bias analysis performed no better than the CCA and MI. Conclusion: The Monte Carlo bias analysis we describe is easy to implement in standard software and, in the setting we explored, is a viable alternative to a Bayesian bias analysis. We caution careful consideration of choice of auxiliary variables when applying imputation where data may be MNAR. |
format | Article |
id | doaj-art-bff0a04323cc4ded81d4e12595170e9c |
institution | Kabale University |
issn | 1471-2288 |
language | English |
publishDate | 2024-11-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Research Methodology |
spelling | doaj-art-bff0a04323cc4ded81d4e12595170e9c; 2024-11-17T12:34:06Z; eng; BMC; BMC Medical Research Methodology; 1471-2288; 2024-11-01; 24; 1; 1; 14; 10.1186/s12874-024-02382-4; Emily Kawabata (MRC Integrative Epidemiology Unit, University of Bristol); Daniel Major-Smith (MRC Integrative Epidemiology Unit, University of Bristol); Gemma L. Clayton (MRC Integrative Epidemiology Unit, University of Bristol); Chin Yang Shapland (MRC Integrative Epidemiology Unit, University of Bristol); Tim P. Morris (MRC Clinical Trials Unit at UCL); Alice R. Carter (MRC Integrative Epidemiology Unit, University of Bristol); Alba Fernández-Sanlés (MRC Unit for Lifelong Health and Ageing at University College London); Maria Carolina Borges (MRC Integrative Epidemiology Unit, University of Bristol); Kate Tilling (MRC Integrative Epidemiology Unit, University of Bristol); Gareth J. Griffith (MRC Integrative Epidemiology Unit, University of Bristol); Louise A. C. Millard (MRC Integrative Epidemiology Unit, University of Bristol); George Davey Smith (MRC Integrative Epidemiology Unit, University of Bristol); Deborah A. Lawlor (MRC Integrative Epidemiology Unit, University of Bristol); Rachael A. Hughes (MRC Integrative Epidemiology Unit, University of Bristol) |
title | Accounting for bias due to outcome data missing not at random: comparison and illustration of two approaches to probabilistic bias analysis: a simulation study |
topic | Bayesian bias analysis; Inverse probability weighting; Missing not at random; Monte Carlo bias analysis; Multiple imputation; Probabilistic bias analysis |
url | https://doi.org/10.1186/s12874-024-02382-4 |
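
The abstract describes a Monte Carlo bias analysis as one that samples the bias parameter directly from its prior distribution and then imputes the missing outcome under that value. The following is a minimal sketch of that idea and not the authors' implementation. It assumes a deliberately simplified setting: the target is the mean of a single continuous outcome rather than a regression coefficient, and missing outcomes are assumed to follow the observed-outcome distribution shifted by a bias parameter `delta` with a normal prior. The function name, the data, the prior values and the way random sampling error is added are all hypothetical choices made for illustration.

```python
import numpy as np


def monte_carlo_bias_analysis(y_obs, n_missing, prior_mean, prior_sd,
                              n_draws=2000, seed=0):
    """Monte Carlo bias analysis for the mean of a partly missing outcome.

    Illustrative assumption (not taken from the paper): missing outcomes are
    distributed like the observed ones, shifted by a bias parameter delta,
    with delta ~ Normal(prior_mean, prior_sd).
    """
    rng = np.random.default_rng(seed)
    n_total = len(y_obs) + n_missing
    mu_obs = y_obs.mean()
    sd_obs = y_obs.std(ddof=1)

    estimates = np.empty(n_draws)
    for i in range(n_draws):
        # 1. Draw the bias parameter from its prior distribution.
        delta = rng.normal(prior_mean, prior_sd)
        # 2. Impute the missing outcomes under this value of delta.
        y_mis = rng.normal(mu_obs + delta, sd_obs, size=n_missing)
        y_full = np.concatenate([y_obs, y_mis])
        # 3. Bias-adjusted estimate, with random sampling error added
        #    (one simple choice among several).
        se = y_full.std(ddof=1) / np.sqrt(n_total)
        estimates[i] = rng.normal(y_full.mean(), se)

    lo, hi = np.percentile(estimates, [2.5, 97.5])
    return float(np.median(estimates)), (float(lo), float(hi))


# Hypothetical data: 500 observed outcomes with mean 0, and 500 missing
# outcomes whose true mean is 1, so the true overall mean is 0.5.
data_rng = np.random.default_rng(1)
y_obs = data_rng.normal(0.0, 1.0, size=500)
n_missing = 500

cca = float(y_obs.mean())  # complete case analysis ignores the shift
est, (lo, hi) = monte_carlo_bias_analysis(
    y_obs, n_missing, prior_mean=1.0, prior_sd=0.2
)
print(f"CCA estimate: {cca:.3f}")
print(f"Bias-adjusted estimate: {est:.3f} (95% interval {lo:.3f} to {hi:.3f})")
```

In this sketch the data never update `delta`: the result depends entirely on the prior supplied, which is consistent with the abstract's finding that the Monte Carlo approach performed well only when reasonably accurate and precise information about the bias parameter was available.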