Data Augmentation-Based Enhancement for Efficient Network Traffic Classification

The need for network traffic classification is growing as users’ applications and devices become more diverse and prevalent. As encryption becomes the norm for security reasons, the traffic classification problem is not easily solved. In this work, we provide...

Bibliographic Details
Main Authors: Chang-Yui Shin, Yang-Seo Choi, Myung-Sup Kim
Format: Article
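Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Network traffic classification; data augmentation; generative adversarial networks; tabular; robustness
Online Access: https://ieeexplore.ieee.org/document/10819380/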
collection DOAJ
description The need for network traffic classification is growing as users’ applications and devices become more diverse and prevalent. As encryption becomes the norm for security reasons, the traffic classification problem is not easily solved. In this work, we provide inductive counterevidence to the vague belief that deep learning models perform well and outperform tree-based machine learning models in all respects across domains, particularly in network traffic classification. By limiting the scope of our research, we address the problem of finding an efficient encrypted traffic classification method for resource-constrained situations. Using the first packet, we converted packet headers and partial encrypted payloads into tabular data through a standardized format. We fed the same inputs to lightweight deep learning models and tree-based machine learning models, analyzed their performance, and identified the efficient models. Next, we improved the performance of the selected efficient traffic classifier through data augmentation. Augmentation was performed only to a degree that did not significantly distort the original data distribution, so that the augmented dataset’s class distribution did not interfere with model learning. We applied two fundamentally contrasting augmentation methods, which differ in whether augmentation is based on individual samples or on the entire dataset. Data augmentation increased the accuracy of the machine learning model by 0.26%, which strengthened its performance in network traffic classification and made it outperform the lightweight deep learning model.
id doaj-art-505e947e959f430896e59ee85d2d35ee
institution Kabale University
issn 2169-3536
spelling doaj-art-505e947e959f430896e59ee85d2d35ee 2025-01-14T00:02:33Z
IEEE Access, vol. 13, pp. 6006-6028 (2025-01-01), ISSN 2169-3536
DOI 10.1109/ACCESS.2024.3525000, IEEE document 10819380
Chang-Yui Shin (https://orcid.org/0000-0002-8410-0177), C4ISR System Development Quality Team, Defense Agency for Technology and Quality, Daejeon, South Korea
Yang-Seo Choi, Department of Cyber Security Research Division, Electronics and Telecommunications Research Institute, Daejeon, South Korea
Myung-Sup Kim (https://orcid.org/0000-0002-3809-2057), Department of Computer Convergence Software, Korea University, Sejong, South Korea
topic Network traffic classification
data augmentation
generative adversarial networks
tabular
robustness