Accounting for bias due to outcome data missing not at random: comparison and illustration of two approaches to probabilistic bias analysis: a simulation study

Abstract Background Bias from data missing not at random (MNAR) is a persistent concern in health-related research. A bias analysis quantitatively assesses how conclusions change under different assumptions about missingness using bias parameters that govern the magnitude and direction of the bias....

Bibliographic Details
Main Authors: Emily Kawabata, Daniel Major-Smith, Gemma L. Clayton, Chin Yang Shapland, Tim P. Morris, Alice R. Carter, Alba Fernández-Sanlés, Maria Carolina Borges, Kate Tilling, Gareth J. Griffith, Louise A. C. Millard, George Davey Smith, Deborah A. Lawlor, Rachael A. Hughes
Format: Article
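Language: English
Published: BMC 2024-11-01
Series: BMC Medical Research Methodology
Subjects: Bayesian bias analysis; Inverse probability weighting; Missing not at random; Monte Carlo bias analysis; Multiple imputation; Probabilistic bias analysis
Online Access: https://doi.org/10.1186/s12874-024-02382-4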
author Emily Kawabata
Daniel Major-Smith
Gemma L. Clayton
Chin Yang Shapland
Tim P. Morris
Alice R. Carter
Alba Fernández-Sanlés
Maria Carolina Borges
Kate Tilling
Gareth J. Griffith
Louise A. C. Millard
George Davey Smith
Deborah A. Lawlor
Rachael A. Hughes
collection DOAJ
description Abstract
Background: Bias from data missing not at random (MNAR) is a persistent concern in health-related research. A bias analysis quantitatively assesses how conclusions change under different assumptions about missingness, using bias parameters that govern the magnitude and direction of the bias. Probabilistic bias analysis specifies a prior distribution for these parameters, explicitly incorporating available information and uncertainty about their true values. A Bayesian bias analysis combines the prior distribution with the data’s likelihood function, whilst a Monte Carlo bias analysis samples the bias parameters directly from the prior distribution. No study has compared a Monte Carlo bias analysis to a Bayesian bias analysis in the context of MNAR missingness.
Methods: We illustrate an accessible probabilistic bias analysis using the Monte Carlo bias analysis approach and a well-known imputation method. We designed a simulation study based on a motivating example from the UK Biobank study, where a large proportion of the outcome was missing and missingness was suspected to be MNAR. We compared the performance of our Monte Carlo bias analysis to a principled Bayesian bias analysis, complete case analysis (CCA) and multiple imputation (MI) assuming missing at random.
Results: As expected, given the simulation study design, CCA and MI estimates were substantially biased, with 95% confidence interval coverages of 7–48%. Including auxiliary variables (i.e., variables not included in the substantive analysis that are predictive of missingness and the missing data) in MI’s imputation model amplified the bias due to assuming missing at random. With reasonably accurate and precise information about the bias parameter, the Monte Carlo bias analysis performed as well as the Bayesian bias analysis. However, when very limited information was provided about the bias parameter, only the Bayesian bias analysis was able to eliminate most of the bias due to MNAR, whilst the Monte Carlo bias analysis performed no better than CCA and MI.
Conclusion: The Monte Carlo bias analysis we describe is easy to implement in standard software and, in the setting we explored, is a viable alternative to a Bayesian bias analysis. We advise careful consideration of the choice of auxiliary variables when applying imputation where data may be MNAR.
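The general idea of a Monte Carlo bias analysis can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple pattern-mixture (delta-adjustment) imputation, a binary exposure, a continuous outcome and a Normal prior for a single bias parameter `delta` (the mean difference between missing and observed outcomes at the same exposure level). The function names `ols`, `simulate` and `monte_carlo_bias_analysis`, and all numeric settings, are hypothetical choices made for illustration.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares: coefficients, their covariance, residual variance."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef, cov, sigma2


def simulate(n, rng, beta=0.5, delta=-1.0):
    """Simulate MNAR data (assumed data-generating model, for illustration only)."""
    x = rng.binomial(1, 0.5, size=n)
    # Missingness is more common among the exposed.
    missing = rng.random(n) < np.where(x == 1, 0.6, 0.3)
    # Outcomes that go missing are shifted by delta, so missingness is MNAR.
    y = 1.0 + beta * x + delta * missing + rng.normal(size=n)
    return x, y, missing


def monte_carlo_bias_analysis(x, y_obs, missing, prior_mean, prior_sd, n_draws, rng):
    """Return draws of the exposure coefficient under a prior on delta."""
    X = np.column_stack([np.ones(len(x)), x])
    coef_obs, cov_obs, sigma2 = ols(X[~missing], y_obs[~missing])
    n_mis = int(missing.sum())
    draws = np.empty(n_draws)
    for s in range(n_draws):
        # 1. Sample the bias parameter directly from its prior.
        delta = rng.normal(prior_mean, prior_sd)
        # 2. Impute missing outcomes from the observed-data model, shifted by delta.
        coef_s = rng.multivariate_normal(coef_obs, cov_obs)
        y_imp = y_obs.copy()
        y_imp[missing] = (X[missing] @ coef_s + delta
                          + rng.normal(scale=np.sqrt(sigma2), size=n_mis))
        # 3. Fit the analysis model to the completed data and add random error.
        coef_full, cov_full, _ = ols(X, y_imp)
        draws[s] = rng.normal(coef_full[1], np.sqrt(cov_full[1, 1]))
    return draws


rng = np.random.default_rng(2024)
x, y, missing = simulate(20_000, rng)
y_obs = np.where(missing, np.nan, y)  # what the analyst actually sees
X = np.column_stack([np.ones(len(x)), x])

full_est = ols(X, y)[0][1]                      # known only in a simulation
cca_est = ols(X[~missing], y[~missing])[0][1]   # complete case analysis
draws = monte_carlo_bias_analysis(x, y_obs, missing, prior_mean=-1.0,
                                  prior_sd=0.2, n_draws=500, rng=rng)
lo, med, hi = np.percentile(draws, [2.5, 50, 97.5])

print(f"full data: {full_est:.3f}  CCA: {cca_est:.3f}")
print(f"bias analysis: {med:.3f} ({lo:.3f}, {hi:.3f})")
```

In this sketch the prior is centred on the value of `delta` used to generate the data, which corresponds to the "reasonably accurate and precise information" scenario in the abstract. Because `delta` is sampled from the prior and never updated by the data, widening `prior_sd` or shifting `prior_mean` feeds directly into the interval, which is the key contrast with a Bayesian bias analysis.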
format Article
id doaj-art-bff0a04323cc4ded81d4e12595170e9c
institution Kabale University
issn 1471-2288
language English
publishDate 2024-11-01
publisher BMC
record_format Article
series BMC Medical Research Methodology
spelling doaj-art-bff0a04323cc4ded81d4e12595170e9c
2024-11-17T12:34:06Z
eng
BMC
BMC Medical Research Methodology
1471-2288
2024-11-01
Volume 24, issue 1, pages 1-14
10.1186/s12874-024-02382-4
Emily Kawabata (MRC Integrative Epidemiology Unit, University of Bristol)
Daniel Major-Smith (MRC Integrative Epidemiology Unit, University of Bristol)
Gemma L. Clayton (MRC Integrative Epidemiology Unit, University of Bristol)
Chin Yang Shapland (MRC Integrative Epidemiology Unit, University of Bristol)
Tim P. Morris (MRC Clinical Trials Unit at UCL)
Alice R. Carter (MRC Integrative Epidemiology Unit, University of Bristol)
Alba Fernández-Sanlés (MRC Unit for Lifelong Health and Ageing at University College London)
Maria Carolina Borges (MRC Integrative Epidemiology Unit, University of Bristol)
Kate Tilling (MRC Integrative Epidemiology Unit, University of Bristol)
Gareth J. Griffith (MRC Integrative Epidemiology Unit, University of Bristol)
Louise A. C. Millard (MRC Integrative Epidemiology Unit, University of Bristol)
George Davey Smith (MRC Integrative Epidemiology Unit, University of Bristol)
Deborah A. Lawlor (MRC Integrative Epidemiology Unit, University of Bristol)
Rachael A. Hughes (MRC Integrative Epidemiology Unit, University of Bristol)
title Accounting for bias due to outcome data missing not at random: comparison and illustration of two approaches to probabilistic bias analysis: a simulation study
topic Bayesian bias analysis
Inverse probability weighting
Missing not at random
Monte Carlo bias analysis
Multiple imputation
Probabilistic bias analysis
url https://doi.org/10.1186/s12874-024-02382-4