Data augmentation scheme for federated learning with non-IID data

Bibliographic Details
Main Authors: Lingtao TANG, Di WANG, Shengyun LIU
Format: Article
Language: Chinese (zho)
Published: Editorial Department of Journal on Communications 2023-01-01
Series: Tongxin xuebao
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2023007/
Description
Summary: To address the low model accuracy that arises in federated learning when data are not independent and identically distributed (non-IID) across clients, a privacy-preserving data augmentation scheme is proposed. First, a data augmentation framework for federated learning scenarios is designed: all clients generate synthetic samples locally and share them with one another, which mitigates the client drift caused by differences among clients’ data distributions. Second, a private sample generation algorithm based on generative adversarial networks and differential privacy is proposed; it helps clients generate informative samples while preserving the privacy of their local data. Finally, a differentially private label selection algorithm is proposed to ensure that the labels of synthetic samples do not leak information. Simulation results demonstrate that, under multiple non-IID data partition strategies, the proposed scheme consistently improves model accuracy and speeds up convergence. Compared with the benchmark approaches, the proposed scheme achieves an accuracy improvement of at least 25% when each client has only one class of samples.
ISSN: 1000-436X
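
The summary's most extreme non-IID setting, in which each client has only one class of samples, can be illustrated with a short sketch. This is not the authors' partition code: the function name `partition_one_class_per_client`, the round-robin assignment of classes to clients, and the toy label list are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def partition_one_class_per_client(labels, num_clients, seed=0):
    """Split sample indices so that every client holds exactly one class.

    Hypothetical helper for illustration only. It assumes
    num_clients >= number of classes; otherwise the samples of
    unassigned classes are simply left out.
    """
    rng = random.Random(seed)

    # Group sample indices by their label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = sorted(by_class)

    # Client c is assigned the class classes[c % len(classes)].
    owners_of_class = defaultdict(list)
    for c in range(num_clients):
        owners_of_class[classes[c % len(classes)]].append(c)

    # Deal each class's samples round-robin among the clients that own it.
    shards = {c: [] for c in range(num_clients)}
    for y, owners in owners_of_class.items():
        idxs = by_class[y][:]
        rng.shuffle(idxs)
        for k, idx in enumerate(idxs):
            shards[owners[k % len(owners)]].append(idx)
    return shards


# Toy data: 1000 samples, 10 balanced classes, 20 clients.
labels = [i % 10 for i in range(1000)]
shards = partition_one_class_per_client(labels, num_clients=20)
classes_per_client = {
    c: {labels[i] for i in idxs} for c, idxs in shards.items()
}
```

Under such a partition every local dataset covers a single label, which is the kind of distribution gap the summary says the shared synthetic samples are meant to ease.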