Evaluation of deep neural network architectures for authorship obfuscation of Portuguese texts

Preserving authorship anonymity is paramount to protect activists, freedom of expression, and critical journalism. Although there are several mechanisms to provide anonymity on the Internet, one can still identify anonymous authors through their writing style. With the advances in neural network and...

Bibliographic Details
Main Authors:	Antônio Marcos Rodrigues Franco, Ítalo Cunha, Leonardo B. Oliveira
Format:	Article
Language:	English
Published:	Elsevier 2024-12-01
Series:	Natural Language Processing Journal
Subjects:	Authorship obfuscation; Privacy; Natural language processing; Artificial intelligence
Online Access:	http://www.sciencedirect.com/science/article/pii/S2949719124000554

_version_	1846123181315719168
author	Antônio Marcos Rodrigues Franco; Ítalo Cunha; Leonardo B. Oliveira
author_facet	Antônio Marcos Rodrigues Franco; Ítalo Cunha; Leonardo B. Oliveira
author_sort	Antônio Marcos Rodrigues Franco
collection	DOAJ
description	Preserving authorship anonymity is paramount to protecting activists, freedom of expression, and critical journalism. Although several mechanisms provide anonymity on the Internet, anonymous authors can still be identified through their writing style. With advances in neural network and natural language processing research, classifiers are becoming increasingly successful at identifying the author of a text. On the other hand, new approaches that use recurrent neural networks to automatically generate obfuscated texts have also arisen to counter anonymity adversaries. In this work, we evaluate two approaches that use neural networks to generate obfuscated texts. The first uses Generative Adversarial Networks to train an encoder–decoder that transforms sentences from an input style into a target style. The second trains an autoencoder with a Gradient Reversal Layer to learn invariant representations. In our experiments, we compare the efficiency of both techniques at removing the stylistic attributes of a text while preserving its original semantics. Our evaluation on real texts clarifies each technique’s trade-offs for Portuguese texts and provides guidance on practical deployment.
format	Article
id	doaj-art-3b563a19b0154d19821ce267ccf7a865
institution	Kabale University
issn	2949-7191
language	English
publishDate	2024-12-01
publisher	Elsevier
record_format	Article
series	Natural Language Processing Journal
spelling	doaj-art-3b563a19b0154d19821ce267ccf7a865 | 2024-12-14T06:34:32Z | eng | Elsevier | Natural Language Processing Journal | 2949-7191 | 2024-12-01 | 9 | 100107 | Evaluation of deep neural network architectures for authorship obfuscation of Portuguese texts | Antônio Marcos Rodrigues Franco (0); Ítalo Cunha (1); Leonardo B. Oliveira (2) | (0) Universidade Federal de Minas Gerais, Av. Presidente Antonio Carlos, 6627, Belo Horizonte, 31270010, Minas Gerais, Brazil; (1) Corresponding author.; Universidade Federal de Minas Gerais, Av. Presidente Antonio Carlos, 6627, Belo Horizonte, 31270010, Minas Gerais, Brazil; (2) Universidade Federal de Minas Gerais, Av. Presidente Antonio Carlos, 6627, Belo Horizonte, 31270010, Minas Gerais, Brazil | http://www.sciencedirect.com/science/article/pii/S2949719124000554 | Authorship obfuscation; Privacy; Natural language processing; Artificial intelligence
spellingShingle	Antônio Marcos Rodrigues Franco Ítalo Cunha Leonardo B. Oliveira Evaluation of deep neural network architectures for authorship obfuscation of Portuguese texts Natural Language Processing Journal Authorship obfuscation Privacy Natural language processing Artificial intelligence
title	Evaluation of deep neural network architectures for authorship obfuscation of Portuguese texts
title_full	Evaluation of deep neural network architectures for authorship obfuscation of Portuguese texts
title_fullStr	Evaluation of deep neural network architectures for authorship obfuscation of Portuguese texts
title_full_unstemmed	Evaluation of deep neural network architectures for authorship obfuscation of Portuguese texts
title_short	Evaluation of deep neural network architectures for authorship obfuscation of Portuguese texts
title_sort	evaluation of deep neural network architectures for authorship obfuscation of portuguese texts
topic	Authorship obfuscation; Privacy; Natural language processing; Artificial intelligence
url	http://www.sciencedirect.com/science/article/pii/S2949719124000554
work_keys_str_mv	AT antoniomarcosrodriguesfranco evaluationofdeepneuralnetworkarchitecturesforauthorshipobfuscationofportuguesetexts AT italocunha evaluationofdeepneuralnetworkarchitecturesforauthorshipobfuscationofportuguesetexts AT leonardoboliveira evaluationofdeepneuralnetworkarchitecturesforauthorshipobfuscationofportuguesetexts

Similar Items