k-means clustering method preserving differential privacy in MapReduce framework

Bibliographic Details
Main Authors: Hong-cheng LI, Xiao-ping WU, Yan CHEN
Format: Article
Language: Chinese (zho)
Published: Editorial Department of Journal on Communications 2016-02-01
Series: Tongxin xuebao
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2016038/
Description
Summary: To address the inability of traditional privacy-preserving methods to withstand malicious analysis backed by arbitrary background knowledge, a differentially private k-means algorithm for distributed environments is proposed. The algorithm runs on the MapReduce computing framework. The host task controls the k-means iterations. The Mapper tasks compute the distance between every record and each cluster center and label each record with the cluster to which it belongs. The Reducer tasks compute, for each cluster, the number of records and the sum of their attribute vectors, and perturb both the counts and the sums with noise generated by the Laplace mechanism in order to achieve differential privacy. Based on the composition properties of differential privacy, the algorithm is proved theoretically to satisfy ε-differential privacy. Experimental results show that the method keeps the clustering results usable while preserving privacy and improving efficiency.
ISSN: 1000-436X
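
Illustration: The division of work described in the summary (host, Mapper, Reducer) can be sketched in a single process as follows. This is a minimal sketch, not the authors' implementation. Several details are assumptions that the record does not state: every attribute is scaled to [0, 1], so that adding or removing one record changes a cluster's count by 1 and its attribute sums by at most d in total (sensitivity d + 1); the total budget `epsilon` is split evenly across iterations; the initial centers are chosen independently of the data; and all function names are hypothetical.

```python
import random
from collections import defaultdict


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def map_record(record, centers):
    # Mapper: emit (index of the nearest center, record).
    dists = [sum((x - c) ** 2 for x, c in zip(record, center))
             for center in centers]
    return dists.index(min(dists)), record


def reduce_cluster(records, dim, eps_iter, rng):
    # Reducer: count and attribute sums of one cluster, each perturbed
    # with Laplace noise. The scale uses the ASSUMED sensitivity d + 1
    # for data in [0, 1]^d.
    scale = (dim + 1) / eps_iter
    noisy_count = len(records) + laplace(scale, rng)
    noisy_sums = [sum(r[j] for r in records) + laplace(scale, rng)
                  for j in range(dim)]
    return noisy_sums, noisy_count


def dp_kmeans(data, centers, epsilon, iterations, seed=0):
    # Host: drive the iterations. Splitting epsilon evenly is an
    # assumption; the article may allocate the budget differently.
    rng = random.Random(seed)
    dim = len(data[0])
    eps_iter = epsilon / iterations
    for _ in range(iterations):
        groups = defaultdict(list)
        for record in data:
            key, value = map_record(record, centers)
            groups[key].append(value)
        new_centers = []
        for k, old in enumerate(centers):
            sums, count = reduce_cluster(groups.get(k, []), dim, eps_iter, rng)
            if count < 1.0:
                # Noisy count too small to divide by: keep the old center.
                new_centers.append(old)
            else:
                # Clamp the noisy mean back into the assumed [0, 1] domain.
                new_centers.append([min(1.0, max(0.0, s / count)) for s in sums])
        centers = new_centers
    return centers
```

Under these assumptions, the clusters within one iteration partition the records, so each iteration costs `epsilon / iterations`, and sequential composition over all iterations gives a total of `epsilon`. A smaller `epsilon` means larger noise and less accurate centers.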