Data Augmentation-Based Enhancement for Efficient Network Traffic Classification

Bibliographic Details
Main Authors: Chang-Yui Shin, Yang-Seo Choi, Myung-Sup Kim
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10819380/
Description
Summary: Network traffic classification is becoming increasingly important as users’ applications and devices become more diverse and prevalent. As encryption becomes the norm for security reasons, the traffic classification problem becomes harder to solve. In this work, we provide inductive counterevidence to the vague belief that deep learning models perform well and outperform tree-based machine learning models in all respects across domains, focusing on the network traffic classification domain. We limit the scope of our research to finding an efficient encrypted traffic classification method for resource-constrained situations. Using the first packet, we converted packet headers and encrypted partial payloads into tabular data in a standardized format. We fed the same inputs to lightweight deep learning models and tree-based machine learning models, analyzed their performance, and identified the efficient models. Next, we improved the performance of the selected efficient traffic classifier through data augmentation. Augmentation was limited to a degree that did not significantly distort the original data distribution, so that the augmented dataset’s class distribution did not interfere with model learning. We applied two fundamentally contrasting methods to augment traffic data, which differ in whether augmentation is based on individual samples or on the entire dataset. Data augmentation increased the accuracy of the machine learning model by 0.26%, which complemented its performance in network traffic classification and made it outperform the lightweight deep learning model.
ISSN: 2169-3536
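
The pipeline outlined in the summary (first packet to tabular row, then augmentation based either on individual samples or on the entire dataset) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' method: the record does not state the row width, the byte layout, or the augmentation algorithms, so `ROW_LEN`, `packet_to_row`, `augment_per_sample`, and `augment_from_class` are all hypothetical names and choices made for this example.

```python
import random

ROW_LEN = 64  # hypothetical fixed row width; the record does not state one


def packet_to_row(first_packet: bytes, row_len: int = ROW_LEN) -> list:
    """Truncate or zero-pad the first packet's bytes into one fixed-width row."""
    clipped = first_packet[:row_len]
    return list(clipped) + [0] * (row_len - len(clipped))


def augment_per_sample(row, rng, flip_prob=0.02):
    """Individual-sample augmentation (assumed variant): perturb a single row
    by replacing each byte with a random byte with small probability."""
    return [rng.randrange(256) if rng.random() < flip_prob else b for b in row]


def augment_from_class(rows, rng):
    """Whole-dataset augmentation (assumed variant): build a synthetic row by
    drawing each column from the values observed in that column across rows."""
    return [rng.choice(column) for column in zip(*rows)]


# Usage with made-up packets standing in for one traffic class.
rng = random.Random(0)
rows = [packet_to_row(bytes([i] * 10)) for i in range(1, 4)]
noisy = augment_per_sample(rows[0], rng)
synthetic = augment_from_class(rows, rng)
```

A small `flip_prob` keeps per-sample perturbation mild, in the spirit of the summary's requirement that augmentation not significantly distort the original data distribution. The resulting rows could then be passed to any tabular classifier.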