LSRM: A New Method for Turkish Text Classification
Text classification is one of the most frequently used approaches in text mining studies. It requires building a model from a predefined dataset, and this model aims to assign uncategorized data to the correct category. To this end, this study used machine...
| Main Author: | Emin Borandağ |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-11-01 |
| Series: | Applied Sciences |
| Subjects: | text classification; text categorization; machine learning; deep learning; CNN; LSTM |
| Online Access: | https://www.mdpi.com/2076-3417/14/23/11143 |
| _version_ | 1846124450126233600 |
|---|---|
| author | Emin Borandağ |
| author_facet | Emin Borandağ |
| author_sort | Emin Borandağ |
| collection | DOAJ |
| description | Text classification is one of the most frequently used approaches in text mining studies. It requires building a model from a predefined dataset, and this model aims to assign uncategorized data to the correct category. To this end, this study used machine learning, deep learning, word embedding, and transfer learning algorithms to classify Turkish texts from three diverse datasets, one of which is new, and analyzed text classification performance for the Turkish language. Because the new dataset is timestamped, its preparation accounted for variations in Turkish word usage patterns over the years. The study also developed a novel method named LSRM to increase text classification performance for agglutinative languages such as Turkish. After the new method was tested on the datasets, a statistical ANOVA indicated that applying LSRM increased classification performance. |
| format | Article |
| id | doaj-art-9be0a0a9a35642a5836f28352021adb8 |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-9be0a0a9a35642a5836f28352021adb8; 2024-12-13T16:22:57Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2024-11-01; 14; 23; 11143; 10.3390/app142311143; LSRM: A New Method for Turkish Text Classification; Emin Borandağ; 0; Department of Software Engineering, Faculty of Technology, Manisa Celal Bayar University, 45140 Manisa, Turkey; https://www.mdpi.com/2076-3417/14/23/11143; text classification; text categorization; machine learning; deep learning; CNN; LSTM |
| spellingShingle | Emin Borandağ; LSRM: A New Method for Turkish Text Classification; Applied Sciences; text classification; text categorization; machine learning; deep learning; CNN; LSTM |
| title | LSRM: A New Method for Turkish Text Classification |
| title_full | LSRM: A New Method for Turkish Text Classification |
| title_fullStr | LSRM: A New Method for Turkish Text Classification |
| title_full_unstemmed | LSRM: A New Method for Turkish Text Classification |
| title_short | LSRM: A New Method for Turkish Text Classification |
| title_sort | lsrm a new method for turkish text classification |
| topic | text classification; text categorization; machine learning; deep learning; CNN; LSTM |
| url | https://www.mdpi.com/2076-3417/14/23/11143 |
| work_keys_str_mv | AT eminborandag lsrmanewmethodforturkishtextclassification |