Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations
Main Authors: | Nicholas Tierney, Dianne Cook |
---|---|
Format: | Article |
Language: | English |
Published: | Foundation for Open Access Statistics, 2023-02-01 |
Series: | Journal of Statistical Software |
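Subjects: | statistical computing, statistical graphics, data science, data visualization, tidyverse, data pipeline |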
Online Access: | https://www.jstatsoft.org/index.php/jss/article/view/4108 |
_version_ | 1846101721681494016 |
---|---|
author | Nicholas Tierney; Dianne Cook |
author_facet | Nicholas Tierney; Dianne Cook |
author_sort | Nicholas Tierney |
collection | DOAJ |
description | Despite the large body of research on missing value distributions and imputation, there is comparatively little literature with a focus on how to make it easy to handle, explore, and impute missing values in data. This paper addresses this gap. The new methodology builds upon tidy data principles, with the goal of integrating missing value handling as a key part of data analysis workflows. We define a new data structure, and a suite of new operations. Together, these provide a connected framework for handling, exploring, and imputing missing values. These methods are available in the R package naniar. |
format | Article |
id | doaj-art-b51e13ad39a8410c90413e746452a1f7 |
institution | Kabale University |
issn | 1548-7660 |
language | English |
publishDate | 2023-02-01 |
publisher | Foundation for Open Access Statistics |
record_format | Article |
series | Journal of Statistical Software |
spelling | doaj-art-b51e13ad39a8410c90413e746452a1f7; 2024-12-29T00:12:52Z; eng; Foundation for Open Access Statistics; Journal of Statistical Software; 1548-7660; 2023-02-01; 1051; 10.18637/jss.v105.i07; 3899; Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations; Nicholas Tierney 0; Dianne Cook 1; Monash University, Telethon Kids Institute; Monash University; Despite the large body of research on missing value distributions and imputation, there is comparatively little literature with a focus on how to make it easy to handle, explore, and impute missing values in data. This paper addresses this gap. The new methodology builds upon tidy data principles, with the goal of integrating missing value handling as a key part of data analysis workflows. We define a new data structure, and a suite of new operations. Together, these provide a connected framework for handling, exploring, and imputing missing values. These methods are available in the R package naniar.; https://www.jstatsoft.org/index.php/jss/article/view/4108; statistical computing; statistical graphics; data science; data visualization; tidyverse; data pipeline |
spellingShingle | Nicholas Tierney; Dianne Cook; Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations; Journal of Statistical Software; statistical computing; statistical graphics; data science; data visualization; tidyverse; data pipeline |
title | Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations |
title_full | Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations |
title_fullStr | Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations |
title_full_unstemmed | Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations |
title_short | Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations |
title_sort | expanding tidy data principles to facilitate missing data exploration visualization and assessment of imputations |
topic | statistical computing; statistical graphics; data science; data visualization; tidyverse; data pipeline |
url | https://www.jstatsoft.org/index.php/jss/article/view/4108 |
work_keys_str_mv | AT nicholastierney expandingtidydataprinciplestofacilitatemissingdataexplorationvisualizationandassessmentofimputations AT diannecook expandingtidydataprinciplestofacilitatemissingdataexplorationvisualizationandassessmentofimputations |