WordDGA: Hybrid Knowledge-Based Word-Level Domain Names Against DGA Classifiers and Adversarial DGAs


Bibliographic Details
Main Authors: Sarojini Selvaraj, Rukmani Panjanathan
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Informatics
Subjects:
Online Access: https://www.mdpi.com/2227-9709/11/4/92
author Sarojini Selvaraj
Rukmani Panjanathan
collection DOAJ
description A Domain Generation Algorithm (DGA) employs botnets to generate domain names through a communication link between the C&C server and the bots. A DGA can generate pseudo-random AGDs (algorithmically generated domains) regularly, a handy method for detecting bots on the C&C server. Unlike current DGA detection methods, AGDs can be identified with lightweight, promising technology. DGAs can prolong the life of a viral operation, improving its profitability. Recent research on the sensitivity of deep learning to various adversarial DGAs has sought to enhance DGA detection techniques. They have character- and word-level classifiers; hybrid-level classifiers may detect and classify AGDs generated by DGAs, significantly diminishing the effectiveness of DGA classifiers. This work introduces WordDGA, a hybrid RCNN-BiLSTM-based adversarial DGA with strong anti-detection capabilities based on NLP and cWGAN, which offers word- and hybrid-level evasion techniques. It initially models the semantic relationships between benign and DGA domains by constructing a prediction model with a hybrid RCNN-BiLSTM network. To optimize the similarity between benign and DGA domain names, it modifies phrases from each input domain using the prediction model to detect DGA family categorizations. The experimental results reveal that dodging numerous wordlists and mixed-level DGA classifiers with training and testing sets improves word repetition rate, domain collision rate, attack success rate, and detection rate, indicating the usefulness of cWGAN-based oversampling in the face of adversarial DGAs.
format Article
id doaj-art-bcb385a81a4e4160b3fecb4a6657918e
institution Kabale University
issn 2227-9709
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Informatics
spelling doaj-art-bcb385a81a4e4160b3fecb4a6657918e 2024-12-27T14:30:40Z
Informatics, vol. 11, no. 4, article 92 (2024-11-01)
doi:10.3390/informatics11040092
Sarojini Selvaraj, Vellore Institute of Technology (VIT) Chennai Campus, Chennai 600127, Tamil Nadu, India
Rukmani Panjanathan, Vellore Institute of Technology (VIT) Chennai Campus, Chennai 600127, Tamil Nadu, India
title WordDGA: Hybrid Knowledge-Based Word-Level Domain Names Against DGA Classifiers and Adversarial DGAs
topic cybersecurity
Domain Generation Algorithms (DGAs)
DNS
word-based DGA botnet
neural language models
RCNN-BiLSTM
url https://www.mdpi.com/2227-9709/11/4/92