Estimating Conditional Distributions with Neural Networks Using R Package deeptrafo

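Contemporary empirical applications frequently require flexible regression models for complex response types and large tabular or non-tabular, including image or text, data. Classical regression models either break down under the computational load of processing such data or require additional manual feature extraction to make these problems tractable. Here, we present deeptrafo, a package for fitting flexible regression models for conditional distributions using a tensorflow back end with numerous additional processors, such as neural networks, penalties, and smoothing splines. Package deeptrafo implements deep conditional transformation models (DCTMs) for binary, ordinal, count, survival, continuous, and time series responses, potentially with uninformative censoring. Unlike other available methods, DCTMs do not assume a parametric family of distributions for the response. Further, the data analyst may trade off interpretability and flexibility by supplying custom neural network architectures and smoothers for each term in an intuitive formula interface. We demonstrate how to set up, fit, and work with DCTMs for several response types. We further showcase how to construct ensembles of these models, evaluate models using inbuilt cross-validation, and use other convenience functions for DCTMs in several applications. Lastly, we discuss DCTMs in light of other approaches to regression with non-tabular data.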

Bibliographic Details
Main Authors: Lucas Kook, Philipp F. M. Baumann, Oliver Dürr, Beate Sick, David Rügamer
Format: Article
Language: English
Published: Foundation for Open Access Statistics 2024-12-01
Series: Journal of Statistical Software
Online Access: https://www.jstatsoft.org/index.php/jss/article/view/5244
author Lucas Kook
Philipp F. M. Baumann
Oliver Dürr
Beate Sick
David Rügamer
collection DOAJ
description Contemporary empirical applications frequently require flexible regression models for complex response types and large tabular or non-tabular, including image or text, data. Classical regression models either break down under the computational load of processing such data or require additional manual feature extraction to make these problems tractable. Here, we present deeptrafo, a package for fitting flexible regression models for conditional distributions using a tensorflow back end with numerous additional processors, such as neural networks, penalties, and smoothing splines. Package deeptrafo implements deep conditional transformation models (DCTMs) for binary, ordinal, count, survival, continuous, and time series responses, potentially with uninformative censoring. Unlike other available methods, DCTMs do not assume a parametric family of distributions for the response. Further, the data analyst may trade off interpretability and flexibility by supplying custom neural network architectures and smoothers for each term in an intuitive formula interface. We demonstrate how to set up, fit, and work with DCTMs for several response types. We further showcase how to construct ensembles of these models, evaluate models using inbuilt cross-validation, and use other convenience functions for DCTMs in several applications. Lastly, we discuss DCTMs in light of other approaches to regression with non-tabular data.
format Article
id doaj-art-5fa5abff0f8d466d87ad9944cf93643a
institution Kabale University
issn 1548-7660
language English
publishDate 2024-12-01
publisher Foundation for Open Access Statistics
record_format Article
series Journal of Statistical Software
spelling doaj-art-5fa5abff0f8d466d87ad9944cf93643a
2024-12-29T00:12:39Z
eng
Foundation for Open Access Statistics
Journal of Statistical Software
1548-7660
2024-12-01
111
1
10.18637/jss.v111.i10
Estimating Conditional Distributions with Neural Networks Using R Package deeptrafo
Lucas Kook (Vienna University of Economics and Business)
Philipp F. M. Baumann (ETH Zurich)
Oliver Dürr (HTWG Konstanz)
Beate Sick (University of Zurich)
David Rügamer (LMU Munich)
https://www.jstatsoft.org/index.php/jss/article/view/5244
title Estimating Conditional Distributions with Neural Networks Using R Package deeptrafo
url https://www.jstatsoft.org/index.php/jss/article/view/5244