Maximal Information Coefficient-Based Undersampling Method for Highly-Imbalanced Learning

Bibliographic Details
Main Author: Haiou Qin
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10820828/
Description
Summary: Learning from highly-imbalanced datasets remains a major challenge in machine learning because models built by general-purpose learning algorithms are weak at correctly recognizing samples from the minority class. Undersampling is one family of methods for dealing with imbalanced learning. In this paper, we propose a new undersampling method based on the maximal information coefficient, comprising two algorithms (MICU-1 and MICU-2), to rebalance the datasets. To evaluate the effectiveness of the method, 20 highly-imbalanced datasets are used as benchmarks. Results show that, compared with other undersampling methods, the maximal information coefficient-based undersampling method is competitive in terms of G-mean and F-measure.
ISSN: 2169-3536
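
Note on the evaluation metrics: the summary reports results in terms of G-mean and F-measure but does not define them. The sketch below uses the commonly used binary-classification definitions (G-mean as the geometric mean of sensitivity and specificity, F-measure as the harmonic mean of precision and recall). These definitions, the function names, and the toy labels are assumptions for illustration and may differ from the exact formulation used in the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class label."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall (assumed definition, beta = 1)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: 2 minority (1) and 4 majority (0) samples.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
print(g_mean(y_true, y_pred), f_measure(y_true, y_pred))
```

Both metrics drop to zero when a classifier ignores the minority class entirely, which is why they are often preferred over plain accuracy on highly-imbalanced data.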