Enhancing Moroccan Dialect Sentiment Analysis Through Optimized Preprocessing and Transfer Learning Techniques
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/10788699/ |
| Summary: | This work investigates the challenges of sentiment analysis for Moroccan Arabic dialect (MD), where the lack of dialect-specific preprocessing methods complicates natural language processing tasks and affects sentiment classification performance. The research evaluates various preprocessing techniques, including stemming and feature extraction, using two main transfer learning approaches: feature extraction with deep learning models and fine-tuning pre-trained models. Experiments were conducted on four MD datasets to assess combinations of stemmers, feature extractors, and architectures. In the feature extraction approach, omitting stemming and employing the QARiB feature extractor with a BiGRU model yielded the highest accuracy on the FB and MAC datasets, reaching 90.45% and 75.50%, respectively. In the fine-tuning approach, DarijaBERT excelled on the FB dataset with an accuracy of 93.37% and an F1-score of 88.55%, while QARiB and AraBERT performed comparably well on the MAC and MSAC datasets. Results suggest that excluding base form reduction methods, such as stemming and lemmatization, during fine-tuning enhances sentiment analysis performance in MD, highlighting the limitations of Modern Standard Arabic techniques for MD processing. This study provides valuable insights for improving natural language processing (NLP) applications in Arabic dialects, particularly in sentiment analysis, by optimizing model performance without relying on standard preprocessing methods. |
| ISSN: | 2169-3536 |