LSRM: A New Method for Turkish Text Classification

Bibliographic Details
Main Author:	Emin Borandağ
Format:	Article
Language:	English
Published:	MDPI AG 2024-11-01
Series:	Applied Sciences
Subjects:	text classification; text categorization; machine learning; deep learning; CNN; LSTM
Online Access:	https://www.mdpi.com/2076-3417/14/23/11143

Description
Summary:	Text classification is one of the most frequently used approaches in text mining studies. It requires generating a model from a predefined dataset; the model then aims to assign uncategorized data to the correct category. To analyze text classification performance for the Turkish language, this study applied machine learning, deep learning, word embedding, and transfer-learning algorithms to three diverse datasets, one of which is new. Because the new dataset consists of timestamped data, its preparation took into account how Turkish word usage patterns have varied over the years. The study also developed a novel method named LSRM to increase text classification performance for agglutinative languages such as Turkish. After the new method was tested on the datasets, an ANOVA analysis indicated that applying LSRM increased classification performance.
ISSN:	2076-3417

Similar Items