Optimising Insider Threat Prediction: Exploring BiLSTM Networks and Sequential Features

Abstract Insider threats pose a critical risk to organisations, impacting their data, processes, resources, and overall security. Such significant risks arise from individuals with authorised access and familiarity with internal systems, emphasising the potential for insider threats to compromise th...

Full description

Saved in:
Bibliographic Details
Main Authors: Phavithra Manoharan, Wei Hong, Jiao Yin, Hua Wang, Yanchun Zhang, Wenjie Ye
Format: Article
Language:English
Published: SpringerOpen 2024-11-01
Series:Data Science and Engineering
Subjects:
Online Access:https://doi.org/10.1007/s41019-024-00260-z
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Abstract Insider threats pose a critical risk to organisations, impacting their data, processes, resources, and overall security. Such significant risks arise from individuals with authorised access and familiarity with internal systems, emphasising the potential for insider threats to compromise the integrity of organisations. Previous research has addressed the challenge by pinpointing malicious actions that have already occurred but provided limited assistance in preventing those risks. In this research, we introduce a novel approach based on bidirectional long short-term memory (BiLSTM) networks that effectively captures and analyses the patterns of individual actions and their sequential dependencies. The focus is on predicting whether an individual would be a malicious insider in a future day based on their daily behavioural records over the previous several days. We analyse the performance of the four supervised learning algorithms on manual features, sequential features, and the ground truth of the day with different combinations. In addition, we investigate the performance of different RNN models, such as RNN, LSTM, and BiLSTM, in incorporating these features. Moreover, we explore the performance of different predictive lengths on the ground truth of the day and different embedded lengths for the sequential features. All the experiments are conducted on the CERT r4.2 dataset. Experiment results show that BiLSTM has the highest performance in combining these features.
ISSN:2364-1185
2364-1541