Feature selection method for software defect number prediction based on maximum information coefficient
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | zho |
Published: | Beijing Xintong Media Co., Ltd, 2021-05-01 |
Series: | Dianxin kexue |
Subjects: | |
Online Access: | http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2021025/ |
Summary: | Traditional feature selection methods consider only the linear correlation between variables and ignore nonlinear correlation, which makes it difficult to select an effective feature subset for building a model that predicts the number of faults in software modules. To account for both linear and nonlinear relationships, a feature selection method based on the maximum information coefficient (MIC) was proposed. The proposed method separated redundancy analysis and correlation analysis into two phases. In the first phase, a clustering algorithm based on the correlation between features was used to group redundant features into the same cluster. In the second phase, the features in each cluster were sorted in descending order of their correlation with the number of software defects, and the top-ranked features were selected to form the feature subset. The experimental results show that the proposed method can improve the performance of software defect number prediction models by effectively removing redundant and irrelevant features. |
---|---|
ISSN: | 1000-0801 |
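
The two-phase procedure described in the Summary can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the record does not name the clustering algorithm, so a simple greedy threshold clustering is assumed; `binned_nmi` is a stand-in dependence score (mutual information over equal-width bins) and is not the MIC itself; the names `select_features`, `threshold`, and `top_k` are hypothetical. A real MIC estimator (for example from a library such as minepy) could presumably be passed in as `score` instead.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Score = Callable[[Sequence[float], Sequence[float]], float]


def binned_nmi(x: Sequence[float], y: Sequence[float], bins: int = 4) -> float:
    """Stand-in dependence score in [0, 1]; NOT the MIC.

    Mutual information over equal-width bins, divided by log(bins).
    It reacts to nonlinear as well as linear dependence.
    """
    def discretize(values: Sequence[float]) -> List[int]:
        lo, hi = min(values), max(values)
        if hi == lo:
            return [0] * len(values)
        return [min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in values]

    bx, by = discretize(x), discretize(y)
    n = len(bx)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    mi = sum(c / n * math.log(c * n / (px[i] * py[j])) for (i, j), c in pxy.items())
    return min(1.0, max(0.0, mi / math.log(bins)))


def select_features(
    columns: Sequence[Sequence[float]],
    target: Sequence[float],
    score: Score = binned_nmi,
    threshold: float = 0.6,  # assumed redundancy cut-off, not from the paper
    top_k: int = 1,          # assumed number of features kept per cluster
) -> Tuple[List[int], List[List[int]]]:
    # Phase 1 (redundancy): a feature joins the first cluster whose
    # representative (first member) it is strongly related to.
    clusters: List[List[int]] = []
    for idx, col in enumerate(columns):
        for cluster in clusters:
            if score(columns[cluster[0]], col) >= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])

    # Phase 2 (correlation with the defect count): rank each cluster's
    # members by their score against the target and keep the top_k.
    selected: List[int] = []
    for cluster in clusters:
        ranked = sorted(cluster, key=lambda i: score(columns[i], target), reverse=True)
        selected.extend(ranked[:top_k])
    return sorted(selected), clusters


# Toy data: feature 1 is a linear copy of feature 0; feature 2 is unrelated.
f0 = list(range(16))
f1 = [2 * a + 1 for a in f0]
f2 = [a % 4 for a in f0]
defects = [a // 4 for a in f0]
selected, clusters = select_features([f0, f1, f2], defects)
```

In this sketch every cluster contributes `top_k` features, so a cluster made up only of weakly related features still contributes one; a minimum-score cut-off in phase 2 would be one way to change that, if desired.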