Enhanced Category-Feature Association Measure
Accurately and efficiently categorizing large, high-dimensional text data is a major challenge in text classification. Many features confuse the classification process, so feature selection (FS) strategies are needed to deal with high dimensionality. This paper pr...
| Main Authors: | Soran S. Badawi, Ari M. Saeed, Sara A. Ahmed, Diyari A. Hassan |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Koya University, 2025-08-01 |
| Series: | ARO-The Scientific Journal of Koya University |
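| Subjects: | Dimension reduction; Feature selection; Long short-term memory; Multinomial Naive Bayes; Support vector machines; Text classification |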
| Online Access: | https://aro.koyauniversity.org/index.php/aro/article/view/2034 |
| _version_ | 1849228952430706688 |
|---|---|
| author | Soran S. Badawi; Ari M. Saeed; Sara A. Ahmed; Diyari A. Hassan |
| author_facet | Soran S. Badawi; Ari M. Saeed; Sara A. Ahmed; Diyari A. Hassan |
| author_sort | Soran S. Badawi |
| collection | DOAJ |
| description | Accurately and efficiently categorizing large, high-dimensional text data is a major challenge in text classification. Many features confuse the classification process, so feature selection (FS) strategies are needed to deal with high dimensionality. This paper proposes a novel FS technique based on an enhanced category-feature association measure (ECFAM). ECFAM uses the existence and elimination of terms, together with the complex relationships among terms across different sections. This approach emphasizes the key role of ancillary terms in classifying and differentiating categories. ECFAM is compared with other methods on two widely used datasets, Reuters-21578 and 20-Newsgroups, using two widely employed supervised machine learning classifiers and one deep learning algorithm. The experiments cover nine feature-set sizes, ranging from 50 to 4000. The experimental results show that ECFAM consistently performs better than the other methods in terms of accuracy and computational cost. |
| format | Article |
| id | doaj-art-a038ff00cef24f04b959da7b2c67bb9d |
| institution | Kabale University |
| issn | 2410-9355 2307-549X |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Koya University |
| record_format | Article |
| series | ARO-The Scientific Journal of Koya University |
| spelling | doaj-art-a038ff00cef24f04b959da7b2c67bb9d; 2025-08-22T10:18:52Z; eng; Koya University; ARO-The Scientific Journal of Koya University; 2410-9355; 2307-549X; 2025-08-01; 13; 2; 10.14500/aro.12034; Enhanced Category-Feature Association Measure; Soran S. Badawi (0), https://orcid.org/0000-0001-9117-3078; Ari M. Saeed (1), https://orcid.org/0000-0003-1350-9386; Sara A. Ahmed (2), https://orcid.org/0000-0001-7330-6105; Diyari A. Hassan (3), https://orcid.org/0000-0003-0710-1923; Language Center, Charmo University, Chamchamal, KRG, Iraq, Kurdistan Region – F.R. Iraq; Department of Computer Science, University of Halabja, Halabja, Kurdistan Region – F.R. Iraq; Department of Computer Engineering, Komar University of Science and Technology, Sulaimaniyah, Kurdistan Region – F.R. Iraq; Department of Biomedical Engineering, Faculty of Engineering and Computer Science, Qaiwan International University, Sulaimaniyah, Kurdistan Region – F.R. Iraq; https://aro.koyauniversity.org/index.php/aro/article/view/2034; Dimension reduction; Feature selection; Long short-term memory; Multinomial Naive Bayes; Support vector machines; Text classification |
| spellingShingle | Soran S. Badawi; Ari M. Saeed; Sara A. Ahmed; Diyari A. Hassan; Enhanced Category-Feature Association Measure; ARO-The Scientific Journal of Koya University; Dimension reduction; Feature selection; Long short-term memory; Multinomial Naive Bayes; Support vector machines; Text classification |
| title | Enhanced Category-Feature Association Measure |
| title_full | Enhanced Category-Feature Association Measure |
| title_fullStr | Enhanced Category-Feature Association Measure |
| title_full_unstemmed | Enhanced Category-Feature Association Measure |
| title_short | Enhanced Category-Feature Association Measure |
| title_sort | enhanced category feature association measure |
| topic | Dimension reduction; Feature selection; Long short-term memory; Multinomial Naive Bayes; Support vector machines; Text classification |
| url | https://aro.koyauniversity.org/index.php/aro/article/view/2034 |
| work_keys_str_mv | AT soransbadawi enhancedcategoryfeatureassociationmeasure AT arimsaeed enhancedcategoryfeatureassociationmeasure AT saraaahmed enhancedcategoryfeatureassociationmeasure AT diyariahassan enhancedcategoryfeatureassociationmeasure |
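
The abstract describes ranking terms with a feature-selection score and keeping feature sets of a fixed size (50 to 4000 in the paper). This record does not give the ECFAM formula, so the sketch below does not implement ECFAM. It substitutes the standard chi-square statistic, which also scores a term from its presence and absence inside and outside a category, to illustrate the general idea of top-k selection. The toy corpus, the labels, and the function names `chi_square` and `select_top_k` are assumptions made for illustration only.

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/category contingency table.

    n11: documents in the category that contain the term
    n10: documents outside the category that contain the term
    n01: documents in the category that lack the term
    n00: documents outside the category that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_top_k(docs, labels, k):
    """Score each term by its best per-category chi-square; keep the top k."""
    doc_terms = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*doc_terms))
    cat_sizes = Counter(labels)
    n_docs = len(docs)
    scores = {}
    for term in vocab:
        # Document frequency of the term inside each category.
        in_cat = Counter(
            lab for terms, lab in zip(doc_terms, labels) if term in terms
        )
        df = sum(in_cat.values())
        best = 0.0
        for cat, size in cat_sizes.items():
            n11 = in_cat[cat]
            n10 = df - n11
            n01 = size - n11
            n00 = n_docs - size - n10
            best = max(best, chi_square(n11, n10, n01, n00))
        scores[term] = best
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(vocab, key=lambda t: (-scores[t], t))
    return ranked[:k], scores


# Hypothetical toy corpus (not taken from Reuters-21578 or 20-Newsgroups).
docs = [
    "wheat crop prices rise",
    "wheat harvest report",
    "oil prices rise",
    "crude oil output report",
]
labels = ["grain", "grain", "crude", "crude"]

top_terms, scores = select_top_k(docs, labels, k=2)
print(top_terms)
```

In this toy run, `k` plays the role of the feature-set size, and terms that appear equally often in both categories (such as "prices") score zero. A real comparison would feed the selected terms into the classifiers named in the keywords and repeat the run for each feature-set size.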