Composite Tor traffic features extraction method of webpage in actual network flow based on SDN

Bibliographic Details
Main Authors: Hongping YAN, Qiang ZHOU, Shihao WANG, Wang YAO, Liukun HE, Liangmin WANG
Format: Article
Language: Chinese
Published: Editorial Department of Journal on Communications 2022-03-01
Series: Tongxin xuebao
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2022056/
Description
Summary: Website fingerprinting (WF) methods for Tor webpage traffic usually assume that Tor traffic, or even Tor webpage traffic, has already been separated out. However, distinguishing Tor traffic from the raw traffic of a real network, and Tor webpage traffic from the rest of the Tor traffic, requires a large amount of computation and is harder than the WF attack itself. Based on the current architecture of the Internet, in which network traffic converges on regional central nodes, a bi-directional statistical feature (BSF) was proposed for distinguishing Tor traffic, using the intra-domain global view provided by the SDN structure of the central node together with the node information disclosed by the Tor network. Furthermore, a hidden feature extraction method for Web traffic based on lifted structure fingerprinting (LSF) was proposed, and a composite Tor-webpage-identification traffic feature (CTTF) was built from the BSF and LSF deep features. To address the scarcity of traffic training data, a translation-based traffic data augmentation method was proposed, which keeps the augmented traffic data consistent with Tor traffic captured in a real working environment. The experimental results show that the identification rate with CTTF is about 4% higher than with the original data features alone. When training data is scarce, the data augmentation method yields a larger gain in classification accuracy and effectively reduces the false positive rate.
ISSN: 1000-436X
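
Note: the summary mentions a translation-based traffic data augmentation method but gives no details. The Python sketch below is only an illustration of the general idea, not the authors' implementation. It assumes a trace is a fixed-length list of packet directions (+1 outgoing, -1 incoming, 0 padding), and the names translate_trace and augment, the padding value, and the random choice of shift are all hypothetical.

```python
import random
from typing import List, Optional


def translate_trace(trace: List[int], shift: int, pad: int = 0) -> List[int]:
    """Shift a fixed-length trace by `shift` positions, keeping its length.

    A positive shift moves packets later in the sequence and pads the front;
    a negative shift moves them earlier and pads the end.
    """
    n = len(trace)
    if shift >= 0:
        k = min(shift, n)
        return [pad] * k + trace[: n - k]
    k = min(-shift, n)
    return trace[k:] + [pad] * k


def augment(trace: List[int], copies: int, max_shift: int,
            seed: Optional[int] = None) -> List[List[int]]:
    """Produce `copies` shifted variants of one trace (assumed scheme)."""
    rng = random.Random(seed)
    return [
        translate_trace(trace, rng.randint(-max_shift, max_shift))
        for _ in range(copies)
    ]


if __name__ == "__main__":
    original = [1, -1, -1, 1, -1, 0, 0, 0]
    for variant in augment(original, copies=3, max_shift=2, seed=0):
        print(variant)
```

Each variant keeps the original length, so it can be fed to the same classifier input as the captured trace; whether small shifts of this kind match the paper's notion of consistency with real Tor traffic is an assumption.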