Comparative Analysis of Feature Selection Methods with XGBoost for Malware Detection on the Drebin Dataset
| Main Authors: | Ines Aulia Latifah, Fauzi Adi Rafrastara, Jevan Bintoro, Wildanil Ghozi, Waleed Mahgoub Osman |
|---|---|
| Format: | Article | 
| Language: | English | 
| Published: | LPPM ISB Atma Luhur, 2024-11-01 |
| Series: | Jurnal Sisfokom | 
| Subjects: | android malware detection; drebin; information gain; xgboost; machine learning |
| Online Access: | https://jurnal.atmaluhur.ac.id/index.php/sisfokom/article/view/2294 | 
| _version_ | 1846157786364248064 | 
|---|---|
| author | Ines Aulia Latifah; Fauzi Adi Rafrastara; Jevan Bintoro; Wildanil Ghozi; Waleed Mahgoub Osman |
| author_sort | Ines Aulia Latifah | 
| collection | DOAJ | 
| description | Malware, or malicious software, continues to evolve alongside increasing cyberattacks targeting individual devices and critical infrastructure. Traditional detection methods, such as signature-based detection, are often ineffective against new or polymorphic malware. Therefore, advanced malware detection methods are increasingly needed to counter these evolving threats. This study aims to compare the performance of various feature selection methods combined with the XGBoost algorithm for malware detection using the Drebin dataset, and to identify the best feature selection method to enhance accuracy and efficiency. The experimental results show that XGBoost with the Information Gain method achieves the highest accuracy of 98.7%, with faster training times than other methods like Chi-Squared and ANOVA, which each achieved an accuracy of 98.3%. Information Gain yielded the best performance in accuracy and training time efficiency, while Chi-Squared and ANOVA offered competitive but slightly lower results. This study highlights that appropriate feature selection within machine learning algorithms can significantly improve malware detection accuracy, potentially aiding in real-world cybersecurity applications to prevent harmful cyberattacks. | 
| format | Article | 
| id | doaj-art-a62bbeebf7a242c084d7939df2d9bc23 | 
| institution | Kabale University | 
| issn | 2301-7988 2581-0588 | 
| language | English | 
| publishDate | 2024-11-01 | 
| publisher | LPPM ISB Atma Luhur | 
| record_format | Article | 
| series | Jurnal Sisfokom | 
| spelling | doaj-art-a62bbeebf7a242c084d7939df2d9bc23; 2024-11-25T04:41:49Z; eng; LPPM ISB Atma Luhur; Jurnal Sisfokom; 2301-7988; 2581-0588; 2024-11-01; 13; 3; 403; 409; 10.32736/sisfokom.v13i3.2294902; Comparative Analysis of Feature Selection Methods with XGBoost for Malware Detection on the Drebin Dataset; Ines Aulia Latifah (Department of Informatics Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Indonesia); Fauzi Adi Rafrastara (Department of Informatics Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Indonesia); Jevan Bintoro (Department of Informatics Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Indonesia); Wildanil Ghozi (Department of Informatics Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Indonesia); Waleed Mahgoub Osman (Mathematics Department, College of Education, Sudan University of Science and Technology); https://jurnal.atmaluhur.ac.id/index.php/sisfokom/article/view/2294; android malware detection; drebin; information gain; xgboost; machine learning |
| title | Comparative Analysis of Feature Selection Methods with XGBoost for Malware Detection on the Drebin Dataset | 
| topic | android malware detection; drebin; information gain; xgboost; machine learning |
| url | https://jurnal.atmaluhur.ac.id/index.php/sisfokom/article/view/2294 | 
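
The description in this record compares feature selection scores (Information Gain, Chi-Squared, ANOVA) applied before training XGBoost. As a rough illustration of what such a scoring step does, the sketch below ranks binary features by information gain and by a Pearson chi-squared statistic using only the Python standard library. It is not the authors' code: the feature names and toy data are hypothetical, the chi-squared variant may differ from the one used in the study, and the ANOVA score and the XGBoost training step are left out.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Label entropy minus the entropy remaining after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def chi_squared(feature, labels):
    """Pearson chi-squared statistic of the feature/label contingency table."""
    total = len(labels)
    observed = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    stat = 0.0
    for x, nx in feature_counts.items():
        for y, ny in label_counts.items():
            expected = nx * ny / total
            stat += (observed[(x, y)] - expected) ** 2 / expected
    return stat


def top_k(features, labels, score, k):
    """Names of the k features with the highest score, best first."""
    ranked = sorted(features, key=lambda name: score(features[name], labels), reverse=True)
    return ranked[:k]


# Hypothetical toy data: 1 = malware, 0 = benign; each feature is a binary flag.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "internet": [1, 1, 1, 1, 1, 1, 1, 1],       # constant, carries no signal
    "read_contacts": [1, 1, 1, 0, 1, 0, 0, 0],  # weakly associated with the label
    "send_sms": [1, 1, 1, 1, 0, 0, 0, 0],       # perfectly separates the classes
}

selected_by_ig = top_k(features, labels, information_gain, k=2)
selected_by_chi2 = top_k(features, labels, chi_squared, k=2)
print(selected_by_ig, selected_by_chi2)
```

In a full pipeline, only the selected columns would be passed on to the classifier, which is where any difference in accuracy and training time between scoring methods would show up.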
 
       