Comparing a Thai Words Segmentation Methods in the LST20 Dataset

Bibliographic Details
Main Authors: Krittapol Damrongkamoltip, Khatcha Ruenlek, Wasit Limprasert, Prachya Boonkwan
Format: Article
Language: English
Published: Surindra Rajabhat University, Faculty of Science and Technology, Department of Computer Education 2024-08-01
Series: Journal of Computer and Creative Technology
Online Access: https://so13.tci-thaijo.org/index.php/jcct/article/view/679
Description
Summary: In this era of globalization, where information is widely available, organizations increasingly rely on information to enhance their business. Although data is easy to obtain, natural language processing tasks remain challenging, especially Thai word segmentation, because Thai text lacks clear word boundaries. This makes it difficult to identify the words in a sentence appropriately. This study therefore evaluates the performance of two word segmentation methods, dictionary-based segmentation and learning from data, across six techniques. The main goals are to verify the literal-level accuracy and the processing time of each method and technique. The evaluation uses the LST20 dataset, which contains 3,745 documents and covers 15 news categories. The results show that learning from data is the more efficient method.
ISSN: 2985-1580, 2985-1599