Automatic Real-Time Mining Software Process Activities From SVN Logs Using a Naive Bayes Classifier

The abundance of event data in current software configuration management systems makes it possible to discover software process models automatically by using actual observed behavior. However, traditional process mining algorithms cannot be applied to event logs recorded in software configuration ma...

Bibliographic Details
Main Authors: Rui Zhu, Yichao Dai, Tong Li, Zifei Ma, Ming Zheng, Yahui Tang, Jiayi Yuan, Yue Huang
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Activity classifier; machine learning; software process activity; SVN log
Online Access: https://ieeexplore.ieee.org/document/8864026/
collection DOAJ
description The abundance of event data in current software configuration management systems makes it possible to discover software process models automatically by using actual observed behavior. However, traditional process mining algorithms cannot be applied to event logs recorded in software configuration management (SCM) systems, such as SVN, because of missing activity attributes. To address this problem, a software process activity classifier is proposed to build event-activity mapping relationships from software development event streams, revealing activity attributes and associating the activity to the original SVN log. The proposed approach extracts activity from the SVN log based on semantic features and introduces a novel technique based on a naive Bayes approach to associate event activities dynamically. The approach has been applied to two real-world software development process logs, ArgoUML and jEdit, consisting of more than 80,000 events, covering development information from 1998 to 2015. With the application of our approach to such data, activities can be extracted from event logs and a classifier can be constructed for adding activity attributes to new events. The results of the classification are evaluated in terms of precision rate, recall rate, and the F-measure. Overall, two real-world software development process logs are used to validate the method, and the experimental results show that the approach can mine software process activities from SVN log events automatically and in real-time.
format Article
id doaj-art-c53be28f6f8c4cf6a9178b42dad0ca8a
institution Kabale University
issn 2169-3536
language English
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-c53be28f6f8c4cf6a9178b42dad0ca8a
timestamp 2024-11-27T00:00:17Z
volume 7
pages 146403-146415
doi 10.1109/ACCESS.2019.2945608
document_number 8864026
author_affiliation Rui Zhu (https://orcid.org/0000-0002-5445-963X), School of Software, Yunnan University, Kunming, China
Yichao Dai, College of Computer, National University of Defense Technology, Changsha, China
Tong Li, Key Laboratory in Software Engineering of Yunnan Province, Kunming, China
Zifei Ma, School of Software, Yunnan University, Kunming, China
Ming Zheng, School of Software, Yunnan University, Kunming, China
Yahui Tang, School of Software, Yunnan University, Kunming, China
Jiayi Yuan, School of Software, Yunnan University, Kunming, China
Yue Huang, School of Software, Yunnan University, Kunming, China
title Automatic Real-Time Mining Software Process Activities From SVN Logs Using a Naive Bayes Classifier
topic Activity classifier
machine learning
software process activity
SVN log
url https://ieeexplore.ieee.org/document/8864026/
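
The description in this record outlines a naive Bayes classifier that attaches an activity label to each SVN log event and is evaluated by precision, recall, and F-measure. The sketch below illustrates that general idea only and is not the authors' implementation: it assumes a plain multinomial naive Bayes with add-one smoothing over commit-message tokens, and the activity labels ("bug fixing", "testing", "coding"), the sample messages, and all function and class names are hypothetical. The record does not state which semantic features or activity set the paper actually uses.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(message):
    """Lowercase a commit message and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", message.lower())


class NaiveBayesActivityClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # labeled events seen per activity
        self.word_counts = defaultdict(Counter)  # token counts per activity
        self.vocabulary = set()

    def update(self, message, activity):
        """Learn from one labeled event; can be called as events stream in."""
        tokens = tokenize(message)
        self.class_counts[activity] += 1
        self.word_counts[activity].update(tokens)
        self.vocabulary.update(tokens)

    def predict(self, message):
        """Return the activity with the highest log-posterior, or None if untrained."""
        tokens = tokenize(message)
        total_events = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_activity, best_score = None, -math.inf
        for activity, count in self.class_counts.items():
            score = math.log(count / total_events)  # log prior
            total_words = sum(self.word_counts[activity].values())
            for token in tokens:
                smoothed = self.word_counts[activity][token] + 1
                score += math.log(smoothed / (total_words + vocab_size))
            if score > best_score:
                best_activity, best_score = activity, score
        return best_activity


def precision_recall_f1(true_labels, predicted_labels, activity):
    """Per-activity precision, recall, and F-measure (F1)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == activity and p == activity)
    fp = sum(1 for t, p in pairs if t != activity and p == activity)
    fn = sum(1 for t, p in pairs if t == activity and p != activity)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labeled commit messages, invented for illustration only.
TRAINING_EVENTS = [
    ("fix null pointer bug in parser", "bug fixing"),
    ("fixed crash bug when opening file", "bug fixing"),
    ("add unit test for diagram export", "testing"),
    ("update junit test cases", "testing"),
    ("implement new feature for class diagram", "coding"),
    ("add new feature syntax highlighting", "coding"),
]

classifier = NaiveBayesActivityClassifier()
for text, label in TRAINING_EVENTS:
    classifier.update(text, label)

print(classifier.predict("fix bug in export"))
print(classifier.predict("add junit test"))
```

Because `update` only increments counts, a new labeled event can be folded in without retraining from scratch, which is one simple way to approximate the streaming setting the description mentions. A real pipeline would read messages from the SVN log itself rather than from a hard-coded list.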