Comprehensive Bibliographic Survey and Forward-Looking Recommendations for Software Defect Prediction: Datasets, Validation Methodologies, Prediction Approaches, and Tools

The development of reliable software depends heavily on the effective collaboration between teams responsible for development and testing. Despite ongoing efforts, many software programs still contain bugs that can lead to financial losses and business risks. Therefore, detecting and fixing software...

Bibliographic Details
Main Authors: Mohd Mustaqeem, Mahfooz Alam, Suhel Mustajab, Faisal Alshanketi, Shadab Alam, Mohammed Shuaib
Format: Article
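Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Software defect prediction; classification; artificial intelligence; machine learning; statistical validation; bibliographic survey
Online Access: https://ieeexplore.ieee.org/document/10798423/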
collection DOAJ
description The development of reliable software depends heavily on the effective collaboration between teams responsible for development and testing. Despite ongoing efforts, many software programs still contain bugs that can lead to financial losses and business risks. Therefore, detecting and fixing software defects after release is crucial. While binary classification methods have been commonly used for this purpose, recent Artificial Intelligence (AI) advancements offer new opportunities for software teams to create more robust software. To address challenges in Software Defect Prediction (SDP), we conducted a thorough bibliographic survey of 79 research articles from the year 2011 to 2023 that examined previous models, datasets, data validation techniques, defect detection, prediction methods, and SDP tools. The survey revealed that previous research often lacked appropriate datasets with the necessary characteristics and data validation methods. Additionally, many standard datasets suffer from a lack of labels, which hinders effective defect detection. Systematic literature reviews on SDP are scarce, further emphasizing the importance of this study. Based on the findings, we provide crucial recommendations for designing effective SDP models and tools. The proposed survey outlines an architecture for constructing SDP datasets with the appropriate characteristics, as well as multi-label classification and data validation methodologies for software defects. This approach aims to enhance SDP research and contribute to the development of high-quality software products by improving defect prediction accuracy.
id doaj-art-3d951360234845588578c00a7e24cb25
institution Kabale University
issn 2169-3536
spelling doaj-art-3d951360234845588578c00a7e24cb25
Record date: 2025-01-03T00:01:36Z
Language: eng
Publisher: IEEE
Series: IEEE Access
ISSN: 2169-3536
Published: 2025-01-01
Volume: 13
Pages: 866-903
DOI: 10.1109/ACCESS.2024.3517419
IEEE document number: 10798423
Title: Comprehensive Bibliographic Survey and Forward-Looking Recommendations for Software Defect Prediction: Datasets, Validation Methodologies, Prediction Approaches, and Tools
Authors and affiliations:
Mohd Mustaqeem (https://orcid.org/0000-0001-5055-5969), Department of Computer Science, Aligarh Muslim University, Aligarh, India
Mahfooz Alam (https://orcid.org/0000-0003-0668-9796), Department of Computer Science, Aligarh Muslim University, Aligarh, India
Suhel Mustajab (https://orcid.org/0000-0002-9969-6110), Department of Computer Science, Aligarh Muslim University, Aligarh, India
Faisal Alshanketi (https://orcid.org/0000-0001-5982-5937), Department of Computer Science, College of Engineering and Computer Science, Jazan University, Jazan, Saudi Arabia
Shadab Alam (https://orcid.org/0000-0003-0504-4515), Department of Computer Science, College of Engineering and Computer Science, Jazan University, Jazan, Saudi Arabia
Mohammed Shuaib (https://orcid.org/0000-0001-6657-2587), Department of Computer Science, College of Engineering and Computer Science, Jazan University, Jazan, Saudi Arabia
Online Access: https://ieeexplore.ieee.org/document/10798423/
Keywords: Software defect prediction; classification; artificial intelligence; machine learning; statistical validation; bibliographic survey
topic Software defect prediction
classification
artificial intelligence
machine learning
statistical validation
bibliographic survey