Emotion Classification on Software Engineering Q&A Websites
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wroclaw University of Science and Technology, 2024-11-01 |
| Series: | e-Informatica Software Engineering Journal |
| Subjects: | |
| Online Access: | https://www.e-informatyka.pl/EISEJ/papers/2025/1/4 |
Summary:

Background: With the rapid proliferation of question-and-answer websites for software developers, such as Stack Overflow, there is an increasing need to discern developers' emotions from their posts and to assess the influence of these emotions on productivity, such as efficiency in bug fixing.

Aim: We aimed to develop a reliable emotion classification tool capable of accurately categorizing emotions on Software Engineering (SE) websites, using data augmentation techniques to address the data scarcity problem, because previous research has shown that tools trained on other domains can perform poorly when applied directly to the SE domain.

Method: We utilized four machine learning techniques, namely BERT, CodeBERT, RFC (Random Forest Classifier), and LSTM. Taking an innovative approach to dataset augmentation, we employed word substitution, back translation, and easy data augmentation (EDA) methods. Using these, we developed sixteen unique emotion classification models: EmoClassBERT-Original, EmoClassRFC-Original, EmoClassLSTM-Original, EmoClassCodeBERT-Original, EmoClassLSTM-Substitution, EmoClassBERT-Substitution, EmoClassRFC-Substitution, EmoClassCodeBERT-Substitution, EmoClassBERT-Translation, EmoClassLSTM-Translation, EmoClassRFC-Translation, EmoClassCodeBERT-Translation, EmoClassBERT-EDA, EmoClassLSTM-EDA, EmoClassCodeBERT-EDA, and EmoClassRFC-EDA. We compared the performance of these models against a gold-standard, state-of-the-art dataset and techniques (Multi-label SO BERT and EmoTxt).

Results: An initial investigation showed that models trained on the augmented datasets outperformed those trained on the original dataset. The EmoClassLSTM-Substitution, EmoClassBERT-Substitution, EmoClassCodeBERT-Substitution, and EmoClassRFC-Substitution models improved average F1 score by 13%, 5%, 5%, and 10% over EmoClassLSTM-Original, EmoClassBERT-Original, EmoClassCodeBERT-Original, and EmoClassRFC-Original, respectively. EmoClassCodeBERT-Substitution performed best, outperforming Multi-label SO BERT and EmoTxt by 2.37% and 21.17%, respectively, in average F1 score. A detailed investigation over 100 runs of the dataset showed that the BERT-based and CodeBERT-based models performed best, and revealed no significant differences between models trained on the augmented datasets and those trained on the original dataset across multiple runs.

Conclusion: This research not only underlines the strengths and weaknesses of each architecture but also highlights the pivotal role of data augmentation in refining model performance, especially in the software engineering domain.
| ISSN: | 1897-7979, 2084-4840 |
|---|---|
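The word-substitution augmentation mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration only: the synonym table, function name, and parameters are placeholders, not the paper's actual implementation (which typically draws synonyms from a lexical resource such as WordNet).

```python
import random

# Toy synonym table standing in for a real lexical resource; the
# entries below are illustrative placeholders only.
SYNONYMS = {
    "bug": ["defect", "fault"],
    "fix": ["repair", "resolve"],
    "slow": ["sluggish", "laggy"],
}

def substitute_words(text, n_swaps=1, rng=None):
    """Return an augmented copy of `text` with up to `n_swaps` words
    replaced by synonyms; the emotion label of the original post is
    kept unchanged, yielding extra training examples."""
    rng = rng or random.Random(0)
    words = text.split()
    # Only words with a known synonym are candidates for replacement.
    candidates = [i for i, w in enumerate(words) if w.lower() in SYNONYMS]
    for i in rng.sample(candidates, min(n_swaps, len(candidates))):
        words[i] = rng.choice(SYNONYMS[words[i].lower()])
    return " ".join(words)

augmented = substitute_words("this bug is slow to fix", n_swaps=2)
```

Because every synonym list excludes the original word, each swap is guaranteed to produce a sentence that differs from the input while preserving its length and label, which is what makes the technique useful for the data scarcity problem the abstract describes.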