LSRM: A New Method for Turkish Text Classification

Bibliographic Details
Main Author: Emin Borandağ
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Applied Sciences
Subjects: text classification; text categorization; machine learning; deep learning; CNN; LSTM
Online Access: https://www.mdpi.com/2076-3417/14/23/11143
_version_ 1846124450126233600
author Emin Borandağ
collection DOAJ
description Text classification is one of the most frequently used approaches in text mining studies. It requires generating a model from a predefined dataset; the model then aims to assign uncategorized data to the correct category. To that end, this study used machine learning, deep learning, word embedding, and transfer-learning algorithms to classify Turkish texts across three diverse datasets, one of which is new, in order to analyze text classification performance for the Turkish language. Because the new dataset consists of timestamped data, its preparation took into account variations in Turkish word usage patterns over the years. The study also developed a novel method named LSRM to increase text classification performance for agglutinative languages such as Turkish. After the new method was tested on the datasets, an ANOVA analysis showed that applying the proposed LSRM method increased classification performance.
format Article
id doaj-art-9be0a0a9a35642a5836f28352021adb8
institution Kabale University
issn 2076-3417
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-9be0a0a9a35642a5836f28352021adb8; 2024-12-13T16:22:57Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2024-11-01; volume 14, issue 23, article 11143; DOI 10.3390/app142311143; LSRM: A New Method for Turkish Text Classification; Emin Borandağ (Department of Software Engineering, Faculty of Technology, Manisa Celal Bayar University, 45140 Manisa, Turkey); https://www.mdpi.com/2076-3417/14/23/11143; text classification; text categorization; machine learning; deep learning; CNN; LSTM
title LSRM: A New Method for Turkish Text Classification
topic text classification
text categorization
machine learning
deep learning
CNN
LSTM
url https://www.mdpi.com/2076-3417/14/23/11143
work_keys_str_mv AT eminborandag lsrmanewmethodforturkishtextclassification