Maximal Information Coefficient-Based Undersampling Method for Highly-Imbalanced Learning

Learning from highly-imbalanced datasets remains a major challenge in machine learning because models produced by general-purpose learning algorithms are weak at correctly recognizing samples from the minority class. Undersampling is one class of methods for dealing with imbalanced lear...

Bibliographic Details
Main Author: Haiou Qin
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
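Subjects: Imbalanced classification; imbalanced learning; maximal information coefficient; maximal information coefficient-based undersampling; undersampling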
Online Access: https://ieeexplore.ieee.org/document/10820828/
_version_ 1841550741879128064
author Haiou Qin
author_facet Haiou Qin
author_sort Haiou Qin
collection DOAJ
description Learning from highly-imbalanced datasets remains a major challenge in machine learning because models produced by general-purpose learning algorithms are weak at correctly recognizing samples from the minority class. Undersampling is one class of methods for dealing with imbalanced learning. In this paper, we propose a new undersampling method based on the maximal information coefficient (comprising two algorithms, MICU-1 and MICU-2) to rebalance the datasets. To evaluate the effectiveness of the method, 20 highly-imbalanced datasets are used as benchmarks. Results show that, compared with other undersampling methods, the maximal information coefficient-based undersampling method is competitive in terms of G-mean and F-measure.
format Article
id doaj-art-1bcf831d9fce498c89082d5d5a0003a3
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-1bcf831d9fce498c89082d5d5a0003a3 | 2025-01-10T00:01:23Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | 13 | 4126 | 4135 | 10.1109/ACCESS.2025.3525475 | 10820828 | Maximal Information Coefficient-Based Undersampling Method for Highly-Imbalanced Learning | Haiou Qin (0), https://orcid.org/0009-0006-6773-9215 | School of Information Engineering, Nanchang Institute of Technology, Nanchang, China | Learning from highly-imbalanced datasets remains a major challenge in machine learning because models produced by general-purpose learning algorithms are weak at correctly recognizing samples from the minority class. Undersampling is one class of methods for dealing with imbalanced learning. In this paper, we propose a new undersampling method based on the maximal information coefficient (comprising two algorithms, MICU-1 and MICU-2) to rebalance the datasets. To evaluate the effectiveness of the method, 20 highly-imbalanced datasets are used as benchmarks. Results show that, compared with other undersampling methods, the maximal information coefficient-based undersampling method is competitive in terms of G-mean and F-measure. | https://ieeexplore.ieee.org/document/10820828/ | Imbalanced classification; imbalanced learning; maximal information coefficient; maximal information coefficient-based undersampling; undersampling
title Maximal Information Coefficient-Based Undersampling Method for Highly-Imbalanced Learning
topic Imbalanced classification
imbalanced learning
maximal information coefficient
maximal information coefficient-based undersampling
undersampling
url https://ieeexplore.ieee.org/document/10820828/
work_keys_str_mv AT haiouqin maximalinformationcoefficientbasedundersamplingmethodforhighlyimbalancedlearning