EM-AUC: A Novel Algorithm for Evaluating Anomaly Based Network Intrusion Detection Systems
Effective network intrusion detection using anomaly scores from unsupervised machine learning models depends on the performance of the models. Although unsupervised models do not require labels during the training and testing phases, the assessment of their performance metrics during the evaluation...
Main Authors: | Kevin Z. Bai, John M. Fossaceca |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-12-01 |
Series: | Sensors |
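Subjects: | network intrusion detection; unsupervised machine learning models; EM-AUC algorithm; missing data inference; Area Under the ROC Curve; Area Under the Precision-Recall Curve |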
Online Access: | https://www.mdpi.com/1424-8220/25/1/78 |
_version_ | 1841548897552433152 |
---|---|
author | Kevin Z. Bai John M. Fossaceca |
author_facet | Kevin Z. Bai John M. Fossaceca |
author_sort | Kevin Z. Bai |
collection | DOAJ |
description | Effective network intrusion detection using anomaly scores from unsupervised machine learning models depends on the performance of the models. Although unsupervised models do not require labels during the training and testing phases, the assessment of their performance metrics during the evaluation phase still requires comparing anomaly scores against labels. In real-world scenarios, the absence of labels in massive network datasets makes it infeasible to calculate performance metrics. Therefore, it is valuable to develop an algorithm that calculates robust performance metrics without using labels. In this paper, we propose a novel algorithm, Expectation Maximization-Area Under the Curve (EM-AUC), to derive the Area Under the ROC Curve (AUC-ROC) and the Area Under the Precision-Recall Curve (AUC-PR) by treating the unavailable labels as missing data and replacing them through their posterior probabilities. This algorithm was applied to two network intrusion datasets, yielding robust results. To the best of our knowledge, this is the first time AUC-ROC and AUC-PR, derived without labels, have been used to evaluate network intrusion detection systems. The EM-AUC algorithm enables model training, testing, and performance evaluation to proceed without comprehensive labels, offering a cost-effective and scalable solution for selecting the most effective models for network intrusion detection. |
format | Article |
id | doaj-art-141e22ffced24c83806f3872ea6cd58c |
institution | Kabale University |
issn | 1424-8220 |
language | English |
publishDate | 2024-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj-art-141e22ffced24c83806f3872ea6cd58c; 2025-01-10T13:20:48Z; eng; MDPI AG; Sensors; 1424-8220; 2024-12-01; 25; 1; 78; 10.3390/s25010078; EM-AUC: A Novel Algorithm for Evaluating Anomaly Based Network Intrusion Detection Systems; Kevin Z. Bai (Independent Researcher, Westwood, MA 02090, USA); John M. Fossaceca (Department of Engineering Management and Systems Engineering, George Washington University, Washington, DC 20052, USA); Effective network intrusion detection using anomaly scores from unsupervised machine learning models depends on the performance of the models. Although unsupervised models do not require labels during the training and testing phases, the assessment of their performance metrics during the evaluation phase still requires comparing anomaly scores against labels. In real-world scenarios, the absence of labels in massive network datasets makes it infeasible to calculate performance metrics. Therefore, it is valuable to develop an algorithm that calculates robust performance metrics without using labels. In this paper, we propose a novel algorithm, Expectation Maximization-Area Under the Curve (EM-AUC), to derive the Area Under the ROC Curve (AUC-ROC) and the Area Under the Precision-Recall Curve (AUC-PR) by treating the unavailable labels as missing data and replacing them through their posterior probabilities. This algorithm was applied to two network intrusion datasets, yielding robust results. To the best of our knowledge, this is the first time AUC-ROC and AUC-PR, derived without labels, have been used to evaluate network intrusion detection systems. The EM-AUC algorithm enables model training, testing, and performance evaluation to proceed without comprehensive labels, offering a cost-effective and scalable solution for selecting the most effective models for network intrusion detection.; https://www.mdpi.com/1424-8220/25/1/78; network intrusion detection; unsupervised machine learning models; EM-AUC algorithm; missing data inference; Area Under the ROC Curve; Area Under the Precision-Recall Curve |
spellingShingle | Kevin Z. Bai; John M. Fossaceca; EM-AUC: A Novel Algorithm for Evaluating Anomaly Based Network Intrusion Detection Systems; Sensors; network intrusion detection; unsupervised machine learning models; EM-AUC algorithm; missing data inference; Area Under the ROC Curve; Area Under the Precision-Recall Curve |
title | EM-AUC: A Novel Algorithm for Evaluating Anomaly Based Network Intrusion Detection Systems |
title_full | EM-AUC: A Novel Algorithm for Evaluating Anomaly Based Network Intrusion Detection Systems |
title_fullStr | EM-AUC: A Novel Algorithm for Evaluating Anomaly Based Network Intrusion Detection Systems |
title_full_unstemmed | EM-AUC: A Novel Algorithm for Evaluating Anomaly Based Network Intrusion Detection Systems |
title_short | EM-AUC: A Novel Algorithm for Evaluating Anomaly Based Network Intrusion Detection Systems |
title_sort | em auc a novel algorithm for evaluating anomaly based network intrusion detection systems |
topic | network intrusion detection; unsupervised machine learning models; EM-AUC algorithm; missing data inference; Area Under the ROC Curve; Area Under the Precision-Recall Curve |
url | https://www.mdpi.com/1424-8220/25/1/78 |
work_keys_str_mv | AT kevinzbai emaucanovelalgorithmforevaluatinganomalybasednetworkintrusiondetectionsystems AT johnmfossaceca emaucanovelalgorithmforevaluatinganomalybasednetworkintrusiondetectionsystems |
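
The description field above says the algorithm treats unavailable labels as missing data and replaces them with their posterior probabilities before deriving AUC-ROC and AUC-PR. The record does not give the score model or the exact estimator, so the following is only a minimal sketch of that general idea and not the authors' implementation. It assumes the anomaly scores follow a two-component one-dimensional Gaussian mixture, that the higher-mean component is the anomalous one, and that scores contain no ties. The function names `posterior_anomaly` and `soft_auc` are hypothetical.

```python
import numpy as np


def posterior_anomaly(scores, n_iter=500, tol=1e-9):
    """EM for a two-component 1-D Gaussian mixture (an ASSUMED score model).

    Returns P(anomaly | score) for every score, taking the component with
    the larger mean to be the anomalous one. Assumes the scores are not all
    identical, so both halves of the median split are non-empty.
    """
    s = np.asarray(scores, dtype=float)
    # Initialise the two components from a median split of the scores.
    med = np.median(s)
    low, high = s[s <= med], s[s > med]
    mu = np.array([low.mean(), high.mean()])
    sd = np.array([low.std(), high.std()]) + 1e-6
    pi = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each score.
        z = (s[:, None] - mu) / sd
        dens = pi * np.exp(-0.5 * z ** 2) / (sd * np.sqrt(2.0 * np.pi))
        total = np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
        resp = dens / total
        # M-step: re-estimate weights, means and standard deviations.
        nk = resp.sum(axis=0)
        pi = nk / s.size
        mu = (resp * s[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (s[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        # Stop once the log-likelihood no longer changes.
        ll = float(np.log(total).sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return resp[:, np.argmax(mu)]


def soft_auc(scores, p_anom):
    """AUC-ROC and AUC-PR with soft labels p_anom in [0, 1].

    Each score acts as its own threshold (ties are not grouped). Expected
    true/false positive counts are accumulated from the soft labels; with
    hard 0/1 labels this reduces to the usual AUC and average precision.
    """
    s = np.asarray(scores, dtype=float)
    p = np.asarray(p_anom, dtype=float)[np.argsort(-s, kind="stable")]
    tp = np.cumsum(p)          # expected true positives above each threshold
    fp = np.cumsum(1.0 - p)    # expected false positives above each threshold
    tpr = np.concatenate(([0.0], tp / tp[-1]))
    fpr = np.concatenate(([0.0], fp / fp[-1]))
    # Trapezoidal area under the (expected) ROC curve.
    auc_roc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    # Step-wise area under the (expected) precision-recall curve.
    precision = tp / np.arange(1, s.size + 1)
    auc_pr = float(np.sum(np.diff(tpr) * precision))
    return auc_roc, auc_pr
```

A typical call would be `soft_auc(scores, posterior_anomaly(scores))`, where `scores` are the anomaly scores from an unsupervised detector. If the mixture assumption is badly violated, the label-free estimates can differ substantially from the label-based metrics.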