Privacy-preserving data compression scheme for k-anonymity model based on Huffman coding

The k-anonymity model is widely used as a data anonymization technique for privacy protection during the data release phase. With the advent of the big data era, however, the generation of vast amounts of data poses challenges to data storage. It is not feasible to expand the storage space in...

Bibliographic Details
Main Authors:	Yue YU, Xianzheng LIN, Weihai LI, Nenghai YU
Format:	Article
Language:	English
Published:	POSTS&TELECOM PRESS Co., LTD 2023-08-01
Series:	网络与信息安全学报 (Chinese Journal of Network and Information Security)
Subjects:	k-anonymity model; privacy preservation; data compression storage; Huffman coding
Online Access:	http://www.cjnis.com.cn/thesisDetails#10.11959/j.issn.2096-109x.2023054

author	Yue YU Xianzheng LIN Weihai LI Nenghai YU
collection	DOAJ
description	The k-anonymity model is widely used as a data anonymization technique for privacy protection during the data release phase. With the advent of the big data era, however, the generation of vast amounts of data poses challenges to data storage. Expanding storage space indefinitely through hardware upgrades is not feasible, since memory is costly and storage space is limited. Data compression techniques can therefore reduce storage costs and communication overhead. To reduce the storage space of the data generated by anonymization techniques in the data publishing phase, a compression scheme was proposed for both the original data and the anonymized data of the k-anonymity model. For the original data, the difference between the original data and the anonymized data was calculated according to the set rules and the pre-defined generalization level, and Huffman coding compression was applied to the difference data according to its frequency characteristics. By storing the difference data, the original data can be obtained indirectly, which reduces the storage space of the original data. For the anonymized data, the data usually have high repeatability owing to the generalization rules of the model or the pre-defined generalization hierarchy relations; the larger the value of k, the more generalized and repetitive the anonymized data become. Huffman coding compression was therefore designed for the anonymized data to reduce storage space. The experimental results show that the proposed scheme can significantly reduce the compression rate of both the original data and the anonymized data of the k-anonymity model. Across five models and various k-value settings, the proposed scheme reduces the compression rate of raw and anonymized data by 72.2% and 64.2% on average, respectively, compared to the Windows 11 zip tool.
format	Article
id	doaj-art-a2d30fc92a5e44deaa8995788fe57f7d
institution	Kabale University
issn	2096-109X
language	English
publishDate	2023-08-01
publisher	POSTS&TELECOM PRESS Co., LTD
record_format	Article
series	网络与信息安全学报
title	Privacy-preserving data compression scheme for k-anonymity model based on Huffman coding
topic	k-anonymity model; privacy preservation; data compression storage; Huffman coding
url	http://www.cjnis.com.cn/thesisDetails#10.11959/j.issn.2096-109x.2023054
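
The description above outlines storing Huffman-coded differences between original and anonymized values so that the original data can be recovered indirectly. The following is a minimal Python sketch of that idea, not the authors' implementation. The generalization rule (ages mapped to the lower bound of their decade), the sample values, and the function names `build_huffman_code`, `encode`, and `decode` are all assumptions made for illustration; the record does not specify the article's actual rules or generalization hierarchy.

```python
import heapq
from collections import Counter


def build_huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code inside them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the code words of all symbols into one bit string."""
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    """Invert encode(); relies on the code being prefix-free."""
    inverse = {c: s for s, c in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return out


# Hypothetical data: ages generalized to the lower bound of their decade.
ages = [31, 35, 35, 38, 42, 45, 45, 45]
generalized = [a // 10 * 10 for a in ages]          # assumed generalization rule
diffs = [a - g for a, g in zip(ages, generalized)]  # difference data to store

code = build_huffman_code(diffs)
bits = encode(diffs, code)

# The original values are recovered from the anonymized data plus the differences.
restored = [g + d for g, d in zip(generalized, decode(bits, code))]
```

In this toy input the difference 5 occurs most often and therefore receives the shortest code word, which is the frequency property the description relies on. A real implementation would also need to store the code table and pack the bit string into bytes.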

Similar Items