NLP neural network copyright protection based on black box watermark

With the rapid development of natural language processing techniques, the use of language models in text classification and sentiment analysis has been increasing. However, language models are susceptible to piracy and redistribution by adversaries, posing a serious threat to the intellectual propert...

Bibliographic Details
Main Authors:	Long DAI, Jing ZHANG, Xuefeng FAN, Xiaoyi ZHOU
Format:	Article
Language:	English
Published:	POSTS&TELECOM PRESS Co., LTD 2023-02-01
Series:	网络与信息安全学报
Subjects:	natural language processing; text classification; copyright protection; language model; black box watermarking
Online Access:	http://www.cjnis.com.cn/thesisDetails#10.11959/j.issn.2096-109x.2023009

_version_	1841529595292024832
author	Long DAI Jing ZHANG Xuefeng FAN Xiaoyi ZHOU
author_facet	Long DAI Jing ZHANG Xuefeng FAN Xiaoyi ZHOU
author_sort	Long DAI
collection	DOAJ
description	With the rapid development of natural language processing techniques, language models are increasingly used in text classification and sentiment analysis. However, language models are susceptible to piracy and redistribution by adversaries, which poses a serious threat to the intellectual property of model owners. Researchers have therefore been designing protection mechanisms that identify the copyright information of language models. Existing watermarking schemes for language models in text classification tasks, however, cannot be associated with the owner’s identity, are not robust enough, and cannot regenerate trigger sets. To solve these problems, a black-box watermarking scheme for text classification tasks is proposed that can verify model ownership remotely and quickly. The model owner’s copyright message and key are passed through a Hash-based Message Authentication Code (HMAC) to obtain a message digest, which can prevent forgery and offers high security. A certain amount of text data is randomly selected from each category of the original training set, and the digest is combined with the text data to construct the trigger set; the watermark is then embedded in the language model during training. To evaluate the proposed scheme, watermarks were embedded in three common language models on the IMDB movie reviews and CNews text classification datasets. The experimental results show that the watermark verification accuracy of the proposed scheme can reach 100% without affecting the original model. Even under common attacks such as model fine-tuning and pruning, the proposed scheme shows strong robustness, and it resists forgery attacks. Meanwhile, embedding the watermark does not affect the convergence time of the model, and the embedding efficiency is high.
format	Article
id	doaj-art-2c9d1707a8a54ec6992d75c02988e71a
institution	Kabale University
issn	2096-109X
language	English
publishDate	2023-02-01
publisher	POSTS&TELECOM PRESS Co., LTD
record_format	Article
series	网络与信息安全学报
title	NLP neural network copyright protection based on black box watermark
topic	natural language processing; text classification; copyright protection; language model; black box watermarking
url	http://www.cjnis.com.cn/thesisDetails#10.11959/j.issn.2096-109x.2023009
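
The description field outlines how the trigger set is built: an HMAC digest is derived from the owner's copyright message and key, and that digest is combined with texts sampled from each category of the training set. The Python sketch below illustrates one possible reading of those steps. It is not the authors' implementation. The function names, the choice of SHA-256, the rule of appending the digest to each text, and the use of the digest as the sampling seed (so the same set can be regenerated) are all assumptions made for illustration. The record does not say how trigger labels are assigned, so the sketch keeps each sample's original label.

```python
import hashlib
import hmac
import random
from collections import defaultdict


def owner_digest(key: bytes, copyright_message: str) -> str:
    """HMAC digest of the copyright message under the owner's key (SHA-256 assumed)."""
    return hmac.new(key, copyright_message.encode("utf-8"), hashlib.sha256).hexdigest()


def build_trigger_set(samples, digest, per_class=2):
    """Pick up to `per_class` texts from each category and attach the digest.

    `samples` is a list of (text, label) pairs. Seeding the sampler with the
    digest is an assumption so that the same set can be regenerated later.
    """
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append(text)

    rng = random.Random(digest)  # deterministic for a given key and message
    triggers = []
    for label in sorted(by_label):
        pool = by_label[label]
        chosen = rng.sample(pool, min(per_class, len(pool)))
        # Assumed combination rule: append the digest as a trailing token.
        triggers.extend((f"{text} {digest}", label) for text in chosen)
    return triggers


def verify_digest(key: bytes, copyright_message: str, claimed_digest: str) -> bool:
    """Check a claimed digest against the owner's key using a constant-time comparison."""
    expected = owner_digest(key, copyright_message)
    return hmac.compare_digest(expected, claimed_digest)
```

Because the digest depends on the secret key, a party without the key cannot reproduce it, which is the property the description refers to as resistance to forgery.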


Similar Items