Leveraging Machine Learning for Enhanced Bug Triaging in Open-Source Software Projects

Bug triaging–the process of classifying and assigning software issues to appropriate developers–is a critical yet challenging task in large-scale software development. Manual triaging is time-consuming, inconsistent, and prone to human bias, which often delays issue resolution...

Bibliographic Details
Main Authors:	Nitanta Adhikari, Rabindra Bista, Joao Carlos Ferreira
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Bug triaging; natural language processing (NLP); multi-label classification; model evaluation metrics
Online Access:	https://ieeexplore.ieee.org/document/11106424/

_version_	1849239082257874944
author	Nitanta Adhikari, Rabindra Bista, Joao Carlos Ferreira
author_facet	Nitanta Adhikari, Rabindra Bista, Joao Carlos Ferreira
author_sort	Nitanta Adhikari
collection	DOAJ
description	Bug triaging–the process of classifying and assigning software issues to appropriate developers–is a critical yet challenging task in large-scale software development. Manual triaging is time-consuming, inconsistent, and prone to human bias, which often delays issue resolution and misallocates developer resources. This study explores the application of machine learning to automate and improve bug triaging efficiency and accuracy. Using a dataset of over 122,000 issues from the microsoft/vscode GitHub repository, we evaluate several machine learning models including Bidirectional LSTM, CNN-LSTM, Random Forest, and Multinomial Naive Bayes. Our primary contribution is the development of an Augmented Bidirectional LSTM model that integrates enriched textual features and contextual metadata. This model, optimized using Optuna, outperforms traditional baselines, achieving a Micro F1-score of 0.6469 and Hamming Loss of 0.0133 for label prediction, and a Micro F1-score of 0.5974 with Hamming Loss of 0.0062 for assignee recommendation. In addition to demonstrating strong predictive performance, we present a robust end-to-end pipeline for data preprocessing, augmentation, model training, and evaluation using multi-label classification techniques. The study highlights how deep learning architectures, in combination with feature engineering and hyperparameter tuning, can provide scalable and generalizable components to support the automation of bug triaging. These findings contribute to the growing field of intelligent software maintenance by offering data-driven approaches that can support developer workflows and improve issue management efficiency in open-source environments.
format	Article
id	doaj-art-2d231003b7da4ea59c8dd2a51088f9e9
institution	Kabale University
issn	2169-3536
language	English
publishDate	2025-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-2d231003b7da4ea59c8dd2a51088f9e9 | 2025-08-20T04:01:15Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | 13 | 136237 | 136254 | 10.1109/ACCESS.2025.3595011 | 11106424 | Leveraging Machine Learning for Enhanced Bug Triaging in Open-Source Software Projects | Nitanta Adhikari (0) https://orcid.org/0009-0005-9048-1577 | Rabindra Bista (1) https://orcid.org/0000-0002-0638-5840 | Joao Carlos Ferreira (2) https://orcid.org/0000-0002-6662-0806 | Department of Computer Science and Engineering, Kathmandu University, Kavre, Dhulikhel, Nepal | Department of Computer Science and Engineering, Kathmandu University, Kavre, Dhulikhel, Nepal | Faculty of Logistics, Molde University College, Molde, Norway | https://ieeexplore.ieee.org/document/11106424/ | Bug triaging | natural language processing (NLP) | multi-label classification | model evaluation metrics
spellingShingle	Nitanta Adhikari Rabindra Bista Joao Carlos Ferreira Leveraging Machine Learning for Enhanced Bug Triaging in Open-Source Software Projects IEEE Access Bug triaging natural language processing (NLP) multi-label classification model evaluation metrics
title	Leveraging Machine Learning for Enhanced Bug Triaging in Open-Source Software Projects
title_full	Leveraging Machine Learning for Enhanced Bug Triaging in Open-Source Software Projects
title_fullStr	Leveraging Machine Learning for Enhanced Bug Triaging in Open-Source Software Projects
title_full_unstemmed	Leveraging Machine Learning for Enhanced Bug Triaging in Open-Source Software Projects
title_short	Leveraging Machine Learning for Enhanced Bug Triaging in Open-Source Software Projects
title_sort	leveraging machine learning for enhanced bug triaging in open source software projects
topic	Bug triaging; natural language processing (NLP); multi-label classification; model evaluation metrics
url	https://ieeexplore.ieee.org/document/11106424/
work_keys_str_mv	AT nitantaadhikari leveragingmachinelearningforenhancedbugtriaginginopensourcesoftwareprojects AT rabindrabista leveragingmachinelearningforenhancedbugtriaginginopensourcesoftwareprojects AT joaocarlosferreira leveragingmachinelearningforenhancedbugtriaginginopensourcesoftwareprojects

Similar Items