Lip forgery detection via spatial-frequency domain combination

In recent years, numerous “face-swapping” videos have emerged on social networks, and lip forgery of speakers is one representative type. While such videos make life more entertaining for the public, they pose a significant threat to personal privacy and property security in cyberspace. Currently, under no...

Bibliographic Details
Main Authors:	Jiaying LIN, Wenbo ZHOU, Weiming ZHANG, Nenghai YU
Format:	Article
Language:	English
Published:	POSTS&TELECOM PRESS Co., LTD 2022-12-01
Series:	网络与信息安全学报
Subjects:	DeepFake forgery; DeepFake detection and defense; lip forgery detection; anti-compression; deep learning
Online Access:	http://www.cjnis.com.cn/thesisDetails#10.11959/j.issn.2096-109x.2022075

collection	DOAJ
description	In recent years, numerous “face-swapping” videos have emerged on social networks, and lip forgery of speakers is one representative type. While such videos make life more entertaining for the public, they pose a significant threat to personal privacy and property security in cyberspace. Currently, most lip forgery detection methods perform well on undegraded (uncompressed) video. However, compression is widely used in practice, especially on social media platforms, in face recognition, and in other scenarios. While it reduces spatial (pixel) and temporal redundancy, compression lowers video quality and destroys pixel-to-pixel and frame-to-frame coherence in the spatial domain, which degrades detection performance and can even cause real videos to be misjudged. When the spatial domain cannot provide sufficiently effective features, frequency-domain information naturally becomes a research priority because it can resist compression interference. To address this problem, the advantages of frequency information in image structure and gradient feedback were analyzed. A lip forgery detection method based on spatial-frequency domain combination was then proposed, which exploits the respective characteristics of information in the spatial and frequency domains. For lip features in the spatial domain, an adaptive extraction network and a lightweight attention module were designed. For features in the frequency domain, separate extraction and fusion modules for the different components were designed. A weighted fusion of the spatial-domain lip features and the frequency-domain features then preserved more texture information. In addition, fine-grained constraints were designed for training to enlarge the inter-class distance between real and fake lip features while reducing the intra-class distance. Experimental results show that, benefiting from the frequency information, the proposed method improves detection accuracy under compression and has a certain degree of transferability. In addition, the ablation study on the core modules verifies the effectiveness of the frequency component for resisting compression and of the dual loss function constraint in training.
id	doaj-art-6414f091540a4e76a2b5d0b421624c33
institution	Kabale University
issn	2096-109X

Similar Items