<i>CURATE</i>: Scaling-Up Differentially Private Causal Graph Discovery

Bibliographic Details
Main Authors: Payel Bhattacharjee, Ravi Tandon
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/26/11/946
Description
Summary: Causal graph discovery (CGD) is the process of estimating the underlying probabilistic graphical model that represents the joint distribution of the features of a dataset. CGD algorithms are broadly classified into two categories: (i) constraint-based algorithms, where the outcome depends on conditional independence (CI) tests, and (ii) score-based algorithms, where the outcome depends on an optimized score function. Because sensitive features of observational data are prone to privacy leakage, differential privacy (DP) has been adopted to ensure user privacy in CGD. Adding the same amount of noise at every step of this sequential estimation process affects the predictive performance of the algorithms. The initial CI tests in constraint-based algorithms and the later iterations of the optimization process in score-based algorithms are crucial; thus, they need to be more accurate and less noisy. Based on this key observation, we present <i>CURATE</i> (CaUsal gRaph AdapTivE privacy), a DP-CGD framework with adaptive privacy budgeting. In contrast to existing DP-CGD algorithms, which use uniform privacy budgeting across all iterations, <i>CURATE</i> allows for adaptive privacy budgeting by minimizing the error probability (constraint-based) and maximizing the number of iterations of the optimization problem (score-based) while keeping the cumulative leakage bounded. To validate our framework, we present a comprehensive set of experiments on several datasets and show that <i>CURATE</i> achieves higher utility than existing DP-CGD algorithms with less privacy leakage.
ISSN: 1099-4300
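The contrast the summary draws between uniform and adaptive privacy budgeting can be illustrated with a minimal Python sketch. It is not the allocation rule from the article: the geometric decay in `front_loaded_budget`, the function names, and the example values are all assumptions chosen for illustration. The sketch splits a fixed total budget across sequential steps under basic sequential composition (per-step budgets sum to the total) and compares the Laplace noise scale each step would receive.

```python
import numpy as np


def uniform_budget(total_eps, n_steps):
    """Split the total privacy budget equally across all steps."""
    return np.full(n_steps, total_eps / n_steps)


def front_loaded_budget(total_eps, n_steps, decay=0.5):
    """Hypothetical adaptive split: geometrically decaying weights give
    earlier steps a larger share. This is an illustrative stand-in, not
    the optimization described in the article."""
    weights = decay ** np.arange(n_steps)
    return total_eps * weights / weights.sum()


def noise_scales(budgets, sensitivity=1.0):
    """Laplace mechanism scale per step: sensitivity / epsilon."""
    return sensitivity / budgets


def noisy_statistic(stat, eps, rng, sensitivity=1.0):
    """Release a statistic with Laplace noise calibrated to eps."""
    return stat + rng.laplace(loc=0.0, scale=sensitivity / eps)


total_eps, n_steps = 1.0, 5  # assumed example values
uniform = uniform_budget(total_eps, n_steps)
adaptive = front_loaded_budget(total_eps, n_steps)

rng = np.random.default_rng(0)
for step, (eps_u, eps_a) in enumerate(zip(uniform, adaptive)):
    print(f"step {step}: uniform eps={eps_u:.3f}, adaptive eps={eps_a:.3f}, "
          f"noisy stat (adaptive)={noisy_statistic(0.2, eps_a, rng):.3f}")
```

Both splits spend the same cumulative budget, but the front-loaded split gives the first steps a smaller noise scale, which matches the constraint-based case in the summary, where the initial CI tests matter most. For the score-based case, where later iterations matter more, the weights would be reversed.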