CESNET-TLS-Year22: A year-spanning TLS network traffic dataset from backbone lines


Bibliographic Details
Main Authors: Karel Hynek, Jan Luxemburk, Jaroslav Pešek, Tomáš Čejka, Pavel Šiška
Format: Article
Language: English
Published: Nature Portfolio 2024-10-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-024-03927-4
author Karel Hynek
Jan Luxemburk
Jaroslav Pešek
Tomáš Čejka
Pavel Šiška
collection DOAJ
description The modern approach for network traffic classification (TC), which is an important part of operating and securing networks, is to use machine learning (ML) models that are able to learn intricate relationships between traffic characteristics and communicating applications. A crucial prerequisite is having representative datasets. However, datasets collected from real production networks are not being published in sufficient numbers. Thus, this paper presents a novel dataset, CESNET-TLS-Year22, that captures the evolution of TLS traffic in an ISP network over a year. The dataset contains 180 web service labels and standard TC features, such as packet sequences. The unique year-long time span enables comprehensive evaluation of TC models and assessment of their robustness in the face of the ever-changing environment of production networks.
format Article
id doaj-art-15ccc647721747d8a54b8a0463d24db0
institution Kabale University
issn 2052-4463
language English
publishDate 2024-10-01
publisher Nature Portfolio
record_format Article
series Scientific Data
spelling doaj-art-15ccc647721747d8a54b8a0463d24db0 | 2024-11-10T12:06:18Z | eng | Nature Portfolio | Scientific Data | 2052-4463 | 2024-10-01 | 111110 | 10.1038/s41597-024-03927-4 | CESNET-TLS-Year22: A year-spanning TLS network traffic dataset from backbone lines | Karel Hynek (CESNET), Jan Luxemburk (CESNET), Jaroslav Pešek (CESNET), Tomáš Čejka (CESNET), Pavel Šiška (CESNET) | https://doi.org/10.1038/s41597-024-03927-4
title CESNET-TLS-Year22: A year-spanning TLS network traffic dataset from backbone lines
url https://doi.org/10.1038/s41597-024-03927-4