Predicting multi-label emojis, emotions, and sentiments in code-mixed texts using an emojifying sentiments framework

Abstract In the era of social media, the use of emojis and code-mixed language has become essential in online communication. However, selecting the appropriate emoji that matches a particular sentiment or emotion in the code-mixed text can be difficult. This paper presents a novel task of predicting...

Bibliographic Details
Main Authors: Gopendra Vikram Singh, Soumitra Ghosh, Mauajama Firdaus, Asif Ekbal, Pushpak Bhattacharyya
Format: Article
Language: English
Published: Nature Portfolio 2024-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-58944-5
author Gopendra Vikram Singh
Soumitra Ghosh
Mauajama Firdaus
Asif Ekbal
Pushpak Bhattacharyya
collection DOAJ
description Abstract In the era of social media, the use of emojis and code-mixed language has become essential in online communication. However, selecting the appropriate emoji that matches a particular sentiment or emotion in the code-mixed text can be difficult. This paper presents a novel task of predicting multiple emojis in English-Hindi code-mixed sentences and proposes a new dataset called SENTIMOJI, which extends the SemEval 2020 Task 9 SentiMix dataset. Our approach exploits the relationship between emotion, sentiment, and emojis to build an end-to-end framework. We replace the self-attention sublayers in the transformer encoder with simple linear transformations and use RMS layer normalization instead of standard layer normalization. Moreover, we employ Gated Linear Unit and fully connected layers to predict emojis and identify the emotion and sentiment of a tweet. Our experimental results on the SENTIMOJI dataset demonstrate that the proposed multi-task framework outperforms the single-task framework. We also show that emojis are strongly linked to sentiment and emotion and that identifying sentiment and emotion can aid in accurately predicting the most suitable emoji. Our work contributes to the field of natural language processing and can help in the development of more effective tools for sentiment analysis and emotion recognition in code-mixed languages. The code and data will be available at https://www.iitp.ac.in/~ai-nlp-ml/resources.html#SENTIMOJI to facilitate research.
format Article
id doaj-art-966657643b284222b6bbd10f9a6da9ae
institution Kabale University
issn 2045-2322
language English
publishDate 2024-05-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-966657643b284222b6bbd10f9a6da9ae
2024-11-17T12:24:06Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2024-05-01
14 1 1 24
10.1038/s41598-024-58944-5
Predicting multi-label emojis, emotions, and sentiments in code-mixed texts using an emojifying sentiments framework
Gopendra Vikram Singh (0), Soumitra Ghosh (1), Mauajama Firdaus (2), Asif Ekbal (3), Pushpak Bhattacharyya (4)
Department of Computer Science and Engineering, Indian Institute of Technology Patna
Department of Computer Science and Engineering, Indian Institute of Technology Patna
University of Alberta
Department of Computer Science and Engineering, Indian Institute of Technology Patna
Department of Computer Science and Engineering, Indian Institute of Technology Bombay
https://doi.org/10.1038/s41598-024-58944-5
title Predicting multi-label emojis, emotions, and sentiments in code-mixed texts using an emojifying sentiments framework
url https://doi.org/10.1038/s41598-024-58944-5
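
The abstract in this record mentions two building blocks, RMS layer normalization and a Gated Linear Unit, without giving their formulas. The sketch below shows the commonly used textbook definitions of both in plain Python, as an illustration only. It is not the authors' implementation, and the function names, the `eps` value, and the toy weights are assumptions introduced here.

```python
import math


def rms_norm(x, gain=None, eps=1e-8):
    """Divide a vector by its root mean square (no mean-centering, unlike LayerNorm)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    if gain is None:
        gain = [1.0] * len(x)  # assumed default: no learned rescaling
    return [g * v / rms for g, v in zip(gain, x)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear(x, weight, bias):
    """Affine map; `weight` is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def glu(x, w_value, b_value, w_gate, b_gate):
    """Gated Linear Unit: linear(x) multiplied elementwise by sigmoid(gate(x))."""
    value = linear(x, w_value, b_value)
    gate = linear(x, w_gate, b_gate)
    return [v * sigmoid(g) for v, g in zip(value, gate)]


if __name__ == "__main__":
    hidden = [3.0, -4.0, 0.0, 5.0]  # toy hidden state, hypothetical values
    normed = rms_norm(hidden)
    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    zeros = [0.0] * 4
    print(glu(normed, identity, zeros, identity, zeros))
```

In a trained model the gain vector and both weight matrices would be learned parameters; the identity weights here only keep the example easy to follow by hand.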