Generating User Privacy-Controllable Synthetic Data for Recommendation Systems

Bibliographic Details
Main Authors: Zhenxiang He, Ke Chen, Zhenyu Zhao
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Recommendation system; user privacy; synthetic data
Online Access: https://ieeexplore.ieee.org/document/10820329/
collection DOAJ
description Recommender systems are widely used in e-commerce, news, and advertising, providing personalized recommendations by analyzing user interaction history. However, during large-scale data analysis and sharing, user privacy is at risk of exposure, especially for users who wish to remain anonymous. While existing synthetic data methods perform well in privacy protection, there is still room for improvement in balancing personalized privacy protection and data utility, particularly in sparse data environments. To address this issue, this paper proposes a user-privacy-controllable data augmentation model that generates synthetic substitutes to replace original data, ensuring anonymity. The model mitigates the data sparsity problem by introducing a global average user-item vector and an attention mechanism. The global vector captures the overall user interest trend, reducing noise and outliers, which allows for a more accurate representation of user interests and a precise selection of items to replace, even in sparse data environments. Additionally, the model dynamically generates synthetic substitutes that align with user interests based on privacy preferences, prioritizing the replacement of low-interest items, thus achieving a good balance between privacy protection and data utility. Experimental results on three public datasets show that the model effectively protects privacy while maintaining high data utility, outperforming existing mainstream methods in personalized privacy protection.
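The replacement idea described in the abstract can be illustrated with a small sketch. This is not the authors' model: it assumes that a plain global popularity vector stands in for the paper's attention-based interest estimate, and the function names (`global_item_vector`, `replace_low_interest`) and the `privacy_level` parameter are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set


def global_item_vector(interactions: Dict[str, Set[int]], n_items: int) -> List[float]:
    """Fraction of users who interacted with each item (a global average vector)."""
    counts = [0] * n_items
    for items in interactions.values():
        for i in items:
            counts[i] += 1
    n_users = max(len(interactions), 1)
    return [c / n_users for c in counts]


def replace_low_interest(user_items: Set[int], interest: List[float],
                         privacy_level: float) -> Set[int]:
    """Swap the lowest-interest share of a user's items for unseen substitutes.

    privacy_level in [0, 1] is the fraction of the user's items to replace
    (an assumed stand-in for a per-user privacy preference).
    """
    n_replace = int(privacy_level * len(user_items))
    # Lowest-interest items first; ties broken by item id for determinism.
    ranked = sorted(user_items, key=lambda i: (interest[i], i))
    to_replace = set(ranked[:n_replace])
    # Substitutes: unseen items with the highest interest score.
    candidates = sorted((i for i in range(len(interest)) if i not in user_items),
                        key=lambda i: (-interest[i], i))
    substitutes = set(candidates[:len(to_replace)])
    return (user_items - to_replace) | substitutes


interactions = {"u1": {0, 1, 2, 3}, "u2": {0, 1}, "u3": {0}}
interest = global_item_vector(interactions, n_items=6)
synthetic_u1 = replace_low_interest(interactions["u1"], interest, privacy_level=0.5)
```

With `privacy_level=0.0` the history is returned unchanged; higher values replace a larger share, starting with the items the interest estimate ranks lowest.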
id doaj-art-04967c67d49a4e73b69c0b26f67e674c
institution Kabale University
issn 2169-3536
spelling doaj-art-04967c67d49a4e73b69c0b26f67e674c
timestamp 2025-01-15T00:02:22Z
language eng
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2025-01-01
volume 13
pages 6643-6655
doi 10.1109/ACCESS.2025.3525513
document number 10820329
authors Zhenxiang He (https://orcid.org/0009-0004-8441-9534), Ke Chen (https://orcid.org/0009-0002-3305-0701), Zhenyu Zhao
affiliation School of Cyber Security, Gansu University of Political Science and Law, Lanzhou, China (all three authors)
url https://ieeexplore.ieee.org/document/10820329/
keywords Recommendation system; user privacy; synthetic data