Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations

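Despite the large body of research on missing value distributions and imputation, there is comparatively little literature with a focus on how to make it easy to handle, explore, and impute missing values in data. This paper addresses this gap. The new methodology builds upon tidy data principles, with the goal of integrating missing value handling as a key part of data analysis workflows. We define a new data structure, and a suite of new operations. Together, these provide a connected framework for handling, exploring, and imputing missing values. These methods are available in the R package naniar.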

Bibliographic Details
Main Authors: Nicholas Tierney, Dianne Cook
Format: Article
Language: English
Published: Foundation for Open Access Statistics 2023-02-01
Series: Journal of Statistical Software
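Subjects: statistical computing; statistical graphics; data science; data visualization; tidyverse; data pipeline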
Online Access: https://www.jstatsoft.org/index.php/jss/article/view/4108
author Nicholas Tierney
Dianne Cook
collection DOAJ
description Despite the large body of research on missing value distributions and imputation, there is comparatively little literature with a focus on how to make it easy to handle, explore, and impute missing values in data. This paper addresses this gap. The new methodology builds upon tidy data principles, with the goal of integrating missing value handling as a key part of data analysis workflows. We define a new data structure, and a suite of new operations. Together, these provide a connected framework for handling, exploring, and imputing missing values. These methods are available in the R package naniar.
format Article
id doaj-art-b51e13ad39a8410c90413e746452a1f7
institution Kabale University
issn 1548-7660
language English
publishDate 2023-02-01
publisher Foundation for Open Access Statistics
record_format Article
series Journal of Statistical Software
spelling doaj-art-b51e13ad39a8410c90413e746452a1f7; 2024-12-29T00:12:52Z; eng; Foundation for Open Access Statistics; Journal of Statistical Software; 1548-7660; 2023-02-01; 105; 1; 10.18637/jss.v105.i07; 3899; Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations; Nicholas Tierney (Monash University, Telethon Kids Institute); Dianne Cook (Monash University); Despite the large body of research on missing value distributions and imputation, there is comparatively little literature with a focus on how to make it easy to handle, explore, and impute missing values in data. This paper addresses this gap. The new methodology builds upon tidy data principles, with the goal of integrating missing value handling as a key part of data analysis workflows. We define a new data structure, and a suite of new operations. Together, these provide a connected framework for handling, exploring, and imputing missing values. These methods are available in the R package naniar.; https://www.jstatsoft.org/index.php/jss/article/view/4108; statistical computing; statistical graphics; data science; data visualization; tidyverse; data pipeline
title Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations
topic statistical computing
statistical graphics
data science
data visualization
tidyverse
data pipeline
url https://www.jstatsoft.org/index.php/jss/article/view/4108