Generating User Privacy-Controllable Synthetic Data for Recommendation Systems

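Recommender systems are widely used in e-commerce, news, and advertising, providing personalized recommendations by analyzing user interaction history. However, during large-scale data analysis and sharing, user privacy faces the risk of exposure, especially for users who wish to remain anonymous. While existing synthetic data methods perform well in privacy protection, there is still room for improvement in balancing personalized privacy protection and data utility, particularly in sparse data environments. To address this issue, this paper proposes a user-privacy-controllable data augmentation model that generates synthetic substitutes to replace original data, ensuring anonymity. The model mitigates the data sparsity problem by introducing a global average user-item vector and an attention mechanism. The global vector captures the overall user interest trend, reducing noise and outliers, which allows for a more accurate representation of user interests and a precise selection of items to replace, even in sparse data environments. Additionally, the model dynamically generates synthetic substitutes that align with user interests based on privacy preferences, prioritizing replacing low-interest items, thus achieving a good balance between privacy protection and data utility. Experimental results on three public datasets show that the model effectively protects privacy while maintaining high data utility, outperforming existing mainstream methods in personalized privacy protection.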
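
The replacement idea described in the abstract (swap out a user's low-interest items first, guided by a global average user-item vector and a per-user privacy preference) can be illustrated with a small sketch. This is not the authors' model: it uses no attention mechanism and no learned generator, and every name in it (`global_average`, `anonymize_user`, `privacy`, `blend`) is a hypothetical chosen for illustration. Interest is approximated here as a fixed blend of the user's own score and the global average, which is an assumption rather than the paper's formulation.

```python
from typing import Dict, Set


def global_average(interactions: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Mean score per item over all users; a missing interaction counts as 0."""
    n_users = len(interactions)
    totals: Dict[str, float] = {}
    for items in interactions.values():
        for item, score in items.items():
            totals[item] = totals.get(item, 0.0) + score
    return {item: total / n_users for item, total in totals.items()}


def anonymize_user(
    user_items: Dict[str, float],
    global_avg: Dict[str, float],
    privacy: float,
    blend: float = 0.5,
) -> Set[str]:
    """Replace the lowest-interest share of a user's items with substitutes.

    privacy is the fraction of the user's items to replace (0 = keep all).
    blend weights the user's own score against the global average (assumed).
    """
    if not 0.0 <= privacy <= 1.0:
        raise ValueError("privacy must be in [0, 1]")

    def interest(item: str) -> float:
        own = user_items.get(item, 0.0)
        return blend * own + (1.0 - blend) * global_avg.get(item, 0.0)

    # Number of items to replace, rounded down.
    n_replace = int(privacy * len(user_items))

    # Lowest estimated interest first; item id breaks ties deterministically.
    ranked = sorted(user_items, key=lambda i: (interest(i), i))
    to_replace = set(ranked[:n_replace])
    kept = set(user_items) - to_replace

    # Substitutes: unseen items with the highest estimated interest.
    candidates = sorted(
        (i for i in global_avg if i not in user_items),
        key=lambda i: (-interest(i), i),
    )
    return kept | set(candidates[:n_replace])


interactions = {
    "u1": {"a": 5.0, "b": 1.0, "c": 4.0, "d": 2.0},
    "u2": {"a": 4.0, "e": 5.0},
    "u3": {"c": 3.0, "e": 4.0, "f": 2.0},
}
avg = global_average(interactions)
print(sorted(anonymize_user(interactions["u1"], avg, privacy=0.5)))
# prints ['a', 'c', 'e', 'f']
```

With `privacy=0.5`, half of user `u1`'s four items are replaced: the two with the lowest blended interest (`b` and `d`) give way to the two unseen items (`e` and `f`), while the high-interest items `a` and `c` are kept. Setting `privacy=0.0` returns the original item set unchanged.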

Bibliographic Details
Main Authors:	Zhenxiang He, Ke Chen, Zhenyu Zhao
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Recommendation system, user privacy, synthetic data
Online Access:	https://ieeexplore.ieee.org/document/10820329/

_version_	1841536149778071552
author	Zhenxiang He, Ke Chen, Zhenyu Zhao
author_facet	Zhenxiang He, Ke Chen, Zhenyu Zhao
author_sort	Zhenxiang He
collection	DOAJ
description	Recommender systems are widely used in e-commerce, news, and advertising, providing personalized recommendations by analyzing user interaction history. However, during large-scale data analysis and sharing, user privacy faces the risk of exposure, especially for users who wish to remain anonymous. While existing synthetic data methods perform well in privacy protection, there is still room for improvement in balancing personalized privacy protection and data utility, particularly in sparse data environments. To address this issue, this paper proposes a user-privacy-controllable data augmentation model that generates synthetic substitutes to replace original data, ensuring anonymity. The model mitigates the data sparsity problem by introducing a global average user-item vector and an attention mechanism. The global vector captures the overall user interest trend, reducing noise and outliers, which allows for a more accurate representation of user interests and a precise selection of items to replace, even in sparse data environments. Additionally, the model dynamically generates synthetic substitutes that align with user interests based on privacy preferences, prioritizing replacing low-interest items, thus achieving a good balance between privacy protection and data utility. Experimental results on three public datasets show that the model effectively protects privacy while maintaining high data utility, outperforming existing mainstream methods in personalized privacy protection.
format	Article
id	doaj-art-04967c67d49a4e73b69c0b26f67e674c
institution	Kabale University
issn	2169-3536
language	English
publishDate	2025-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-04967c67d49a4e73b69c0b26f67e674c | 2025-01-15T00:02:22Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | 13 | 6643 | 6655 | 10.1109/ACCESS.2025.3525513 | 10820329 | Generating User Privacy-Controllable Synthetic Data for Recommendation Systems | Zhenxiang He | 0 | https://orcid.org/0009-0004-8441-9534 | Ke Chen | 1 | https://orcid.org/0009-0002-3305-0701 | Zhenyu Zhao | 2 | School of Cyber Security, Gansu University of Political Science and Law, Lanzhou, China | School of Cyber Security, Gansu University of Political Science and Law, Lanzhou, China | School of Cyber Security, Gansu University of Political Science and Law, Lanzhou, China | Recommender systems are widely used in e-commerce, news, and advertising, providing personalized recommendations by analyzing user interaction history. However, during large-scale data analysis and sharing, user privacy faces the risk of exposure, especially for users who wish to remain anonymous. While existing synthetic data methods perform well in privacy protection, there is still room for improvement in balancing personalized privacy protection and data utility, particularly in sparse data environments. To address this issue, this paper proposes a user-privacy-controllable data augmentation model that generates synthetic substitutes to replace original data, ensuring anonymity. The model mitigates the data sparsity problem by introducing a global average user-item vector and an attention mechanism. The global vector captures the overall user interest trend, reducing noise and outliers, which allows for a more accurate representation of user interests and a precise selection of items to replace, even in sparse data environments. Additionally, the model dynamically generates synthetic substitutes that align with user interests based on privacy preferences, prioritizing replacing low-interest items, thus achieving a good balance between privacy protection and data utility. Experimental results on three public datasets show that the model effectively protects privacy while maintaining high data utility, outperforming existing mainstream methods in personalized privacy protection. | https://ieeexplore.ieee.org/document/10820329/ | Recommendation system | user privacy | synthetic data
title	Generating User Privacy-Controllable Synthetic Data for Recommendation Systems
topic	Recommendation system, user privacy, synthetic data
url	https://ieeexplore.ieee.org/document/10820329/

Similar Items